Making Non-Stochastic Control (Almost) as Easy as Stochastic
Abstract
Recent literature has made much progress in understanding \emph{online LQR}: a modern learning-theoretic take on the classical control problem where a learner attempts to optimally control an unknown linear dynamical system with fully observed state, perturbed by i.i.d. Gaussian noise. It is now understood that the optimal regret over time horizon $T$ against the optimal control law scales as $\widetilde{\Theta}(\sqrt{T})$. In this paper, we show that the same regret rate (against a suitable benchmark) is attainable even in the considerably more general non-stochastic control model, where the system is driven by \emph{arbitrary adversarial} noise \citep{agarwal2019online}. We attain the optimal $\widetilde{\mathcal{O}}(\sqrt{T})$ regret when the dynamics are unknown to the learner, and $\mathrm{poly}(\log T)$ regret when known, provided that the cost functions are strongly convex (as in LQR). Our algorithm is based on a novel variant of online Newton step \citep{hazan2007logarithmic}, which adapts to the geometry induced by adversarial disturbances, and our analysis hinges on generic regret bounds for certain structured losses in the OCO-with-memory framework \citep{anava2015online}.
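For concreteness, the regret notion referenced above can be written in a standard form from the online control literature; this is a sketch of the usual formulation, and the paper's exact comparator class $\Pi$ may differ:

```latex
% Regret against the best policy in a comparator class \Pi:
% x_t, u_t are the learner's states and inputs; x_t^{\pi}, u_t^{\pi}
% are the counterfactual trajectory of a fixed policy \pi under the
% same (possibly adversarial) disturbance sequence.
\mathrm{Regret}_T \;=\; \sum_{t=1}^{T} c_t(x_t, u_t)
\;-\; \min_{\pi \in \Pi} \sum_{t=1}^{T} c_t\!\left(x_t^{\pi}, u_t^{\pi}\right)
```

In online LQR the benchmark is typically the optimal linear state-feedback law, and the $\widetilde{\Theta}(\sqrt{T})$ and $\mathrm{poly}(\log T)$ rates in the abstract refer to bounds on this quantity.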
Cite
Text
Simchowitz. "Making Non-Stochastic Control (Almost) as Easy as Stochastic." Neural Information Processing Systems, 2020.
Markdown
[Simchowitz. "Making Non-Stochastic Control (Almost) as Easy as Stochastic." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/simchowitz2020neurips-making/)
BibTeX
@inproceedings{simchowitz2020neurips-making,
  title     = {{Making Non-Stochastic Control (Almost) as Easy as Stochastic}},
  author    = {Simchowitz, Max},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/simchowitz2020neurips-making/}
}