Adaptive Variants of Optimal Feedback Policies
Abstract
The stable combination of optimal feedback policies with online learning is studied in a new control-theoretic framework for uncertain nonlinear systems. The framework can be systematically used in transfer learning and sim-to-real applications, where an optimal policy learned for a nominal system needs to remain effective in the presence of significant variations in parameters. Given unknown parameters within a bounded range, the resulting adaptive control laws guarantee convergence of the closed-loop system to the state of zero cost. Online adjustment of the learning rate is used as a key stability mechanism, and preserves certainty equivalence when designing optimal policies. The approach is illustrated on the familiar mountain car problem, where it yields near-optimal performance despite the presence of parametric model uncertainty.
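To make the abstract's idea concrete, here is a minimal illustrative sketch (not the paper's algorithm) of certainty-equivalence adaptive control for a scalar system x&#775; = a&#8201;&phi;(x) + u with unknown parameter a: the feedback uses the current estimate &acirc; as if it were correct, and &acirc; is updated online with learning rate &gamma;, the quantity whose adjustment the abstract identifies as a stability mechanism. All names and dynamics here are hypothetical choices for illustration.

```python
import numpy as np

def simulate(a_true=2.0, a_hat0=0.0, gamma=5.0, dt=0.01, steps=2000):
    """Hypothetical example: adapt an unknown scalar parameter online.

    True dynamics: x_dot = a_true * phi(x) + u, with phi known but a_true not.
    Certainty-equivalence control: u = -a_hat * phi(x) - x, i.e. the policy is
    designed as if the estimate a_hat were the true parameter.
    Adaptation law: a_hat_dot = gamma * x * phi(x), the standard gradient-type
    update for which V = x**2/2 + (a_true - a_hat)**2/(2*gamma) is a Lyapunov
    function (V_dot = -x**2), so x converges to zero and a_hat stays bounded.
    """
    x, a_hat = 1.0, a_hat0
    for _ in range(steps):
        phi = np.sin(x)                 # known regressor
        u = -a_hat * phi - x            # certainty-equivalence feedback
        x_dot = a_true * phi + u        # true (unknown-parameter) dynamics
        a_hat += gamma * x * phi * dt   # online parameter update, rate gamma
        x += x_dot * dt                 # forward-Euler integration
    return x, a_hat
```

Under this adaptation law the tracking error is driven to zero even though &acirc; need not converge to the true parameter, which mirrors the abstract's claim of convergence to the zero-cost state despite parametric uncertainty.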
Cite
Text
Lopez and Slotine. "Adaptive Variants of Optimal Feedback Policies." Proceedings of The 4th Annual Learning for Dynamics and Control Conference, 2022.
Markdown
[Lopez and Slotine. "Adaptive Variants of Optimal Feedback Policies." Proceedings of The 4th Annual Learning for Dynamics and Control Conference, 2022.](https://mlanthology.org/l4dc/2022/lopez2022l4dc-adaptive/)
BibTeX
@inproceedings{lopez2022l4dc-adaptive,
title = {{Adaptive Variants of Optimal Feedback Policies}},
author = {Lopez, Brett and Slotine, Jean-Jacques},
booktitle = {Proceedings of The 4th Annual Learning for Dynamics and Control Conference},
year = {2022},
pages = {1125--1136},
volume = {168},
url = {https://mlanthology.org/l4dc/2022/lopez2022l4dc-adaptive/}
}