Stabilizing Dynamical Systems via Policy Gradient Methods
Abstract
Stabilizing an unknown control system is one of the most fundamental problems in control systems engineering. In this paper, we provide a simple, model-free algorithm for stabilizing fully observed dynamical systems. While model-free methods have become increasingly popular in practice due to their simplicity and flexibility, stabilization via direct policy search has received surprisingly little attention. Our algorithm proceeds by solving a series of discounted LQR problems, where the discount factor is gradually increased. We prove that this method efficiently recovers a stabilizing controller for linear systems, and for smooth, nonlinear systems within a neighborhood of their equilibria. Our approach overcomes a significant limitation of prior work, namely the need for a pre-given stabilizing control policy. We empirically evaluate the effectiveness of our approach on common control benchmarks.
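The abstract only sketches the core loop: solve a sequence of discounted LQR problems while gradually increasing the discount factor until the resulting controller stabilizes the undiscounted system. Below is a minimal, model-based illustration of that discount-annealing idea for a known linear system (A, B). It is a sketch under stated assumptions, not the authors' algorithm: the paper's method is model-free and solves each discounted subproblem by policy gradient, whereas here the inner solve uses a Riccati equation, and the matrices and geometric discount schedule are illustrative choices.

```python
import numpy as np
from scipy.linalg import solve_discrete_are


def spectral_radius(M):
    """Largest eigenvalue magnitude; < 1 means the closed loop is stable."""
    return max(abs(np.linalg.eigvals(M)))


def discounted_lqr_gain(A, B, Q, R, gamma):
    """Solve the gamma-discounted LQR problem (model-based, for illustration).

    Discounting by gamma is equivalent to standard LQR on the scaled dynamics
    (sqrt(gamma) * A, sqrt(gamma) * B), which is well-posed even when A itself
    is unstable, provided gamma is small enough.
    """
    As, Bs = np.sqrt(gamma) * A, np.sqrt(gamma) * B
    P = solve_discrete_are(As, Bs, Q, R)
    return np.linalg.solve(R + Bs.T @ P @ Bs, Bs.T @ P @ As)


def anneal_discount(A, B, Q, R, gamma0=0.05, growth=1.5, max_iters=100):
    """Increase the discount factor until the gain stabilizes A - B K."""
    gamma = gamma0
    for _ in range(max_iters):
        K = discounted_lqr_gain(A, B, Q, R, gamma)
        if spectral_radius(A - B @ K) < 1.0:
            return K, gamma
        gamma = min(1.0, growth * gamma)
    raise RuntimeError("no stabilizing gain found within the iteration budget")


# Toy usage on an open-loop unstable system (illustrative values).
A = np.array([[1.2, 0.5], [0.0, 1.1]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
K, gamma = anneal_discount(A, B, Q, R)
print("stabilized at discount", gamma, "closed-loop radius", spectral_radius(A - B @ K))
```

In the model-free setting described in the abstract, the inner Riccati solve would be replaced by a policy-gradient solve of the same discounted subproblem, and the discount schedule would presumably be driven by estimated quantities rather than the fixed growth factor used above.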
Cite
Text
Perdomo et al. "Stabilizing Dynamical Systems via Policy Gradient Methods." Neural Information Processing Systems, 2021.
Markdown
[Perdomo et al. "Stabilizing Dynamical Systems via Policy Gradient Methods." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/perdomo2021neurips-stabilizing/)
BibTeX
@inproceedings{perdomo2021neurips-stabilizing,
title = {{Stabilizing Dynamical Systems via Policy Gradient Methods}},
author = {Perdomo, Juan and Umenberger, Jack and Simchowitz, Max},
booktitle = {Neural Information Processing Systems},
year = {2021},
url = {https://mlanthology.org/neurips/2021/perdomo2021neurips-stabilizing/}
}