Meta-Learning Linear Quadratic Regulators: A Policy Gradient MAML Approach for Model-Free LQR

Abstract

We investigate the problem of learning linear quadratic regulators (LQR) in a multi-task, heterogeneous, and model-free setting. We characterize the stability and personalization guarantees of a policy gradient-based (PG) model-agnostic meta-learning (MAML) approach (Finn et al., 2017) for the LQR problem under different task-heterogeneity settings. We show that our MAML-LQR algorithm produces a stabilizing controller that is close to each task-specific optimal controller, up to a task-heterogeneity bias, in both model-based and model-free learning scenarios. Moreover, in the model-based setting, we show that such a controller is achieved with a linear convergence rate, improving upon the sub-linear rates of existing work. Our theoretical guarantees demonstrate that the learned controller can efficiently adapt to unseen LQR tasks.
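
To make the setup concrete, below is a minimal, self-contained sketch of the generic structure the abstract refers to: a MAML-style loop over heterogeneous LQR tasks, with policy gradients estimated model-free via two-point zeroth-order perturbations. This is not the authors' MAML-LQR algorithm (the paper should be consulted for the actual method and its guarantees); the system matrices, task perturbations, horizon, sample counts, and step sizes below are all illustrative assumptions, and the meta-gradient uses a first-order approximation that drops the Hessian term of the full MAML update.

import numpy as np

rng = np.random.default_rng(0)

def lqr_cost(K, A, B, Q, R, horizon=30, n_rollouts=10):
    # Average finite-horizon cost of the static linear policy u = -K x
    # over random initial states.
    total = 0.0
    for _ in range(n_rollouts):
        x = rng.standard_normal(A.shape[0])
        for _ in range(horizon):
            u = -K @ x
            total += x @ Q @ x + u @ R @ u
            x = A @ x + B @ u
    return total / n_rollouts

def zo_gradient(K, cost_fn, smoothing=0.05, n_samples=20):
    # Two-point zeroth-order (model-free) estimate of the policy gradient at K.
    grad = np.zeros_like(K)
    for _ in range(n_samples):
        U = rng.standard_normal(K.shape)
        U /= np.linalg.norm(U)
        delta = cost_fn(K + smoothing * U) - cost_fn(K - smoothing * U)
        grad += (delta / (2.0 * smoothing)) * U
    return grad / n_samples

# Heterogeneous LQR tasks: small random perturbations of a nominal system.
n, m = 2, 1
A0 = np.array([[1.0, 0.1], [0.0, 1.0]])
B0 = np.array([[0.0], [0.1]])
tasks = [(A0 + 0.02 * rng.standard_normal((n, n)), B0, np.eye(n), 0.1 * np.eye(m))
         for _ in range(4)]

K = np.array([[0.5, 1.0]])     # initial gain; stabilizes the nominal system
alpha, beta = 1e-4, 1e-4       # inner (adaptation) and outer (meta) step sizes

for _ in range(20):
    meta_grad = np.zeros_like(K)
    for (A, B, Q, R) in tasks:
        cost = lambda P: lqr_cost(P, A, B, Q, R)
        K_adapted = K - alpha * zo_gradient(K, cost)   # one-step task adaptation
        meta_grad += zo_gradient(K_adapted, cost)      # first-order MAML gradient
    K = K - beta * meta_grad / len(tasks)              # meta (outer) update

print("meta-learned gain K:", K)

The intent of the meta-update is the one described in the abstract: find a single gain K that, after one cheap task-specific policy gradient step, is close to each task's optimal controller despite heterogeneity across the task systems.
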

Cite

Text

Toso et al. "Meta-Learning Linear Quadratic Regulators: A Policy Gradient MAML Approach for Model-Free LQR." Proceedings of the 6th Annual Learning for Dynamics & Control Conference, 2024.

Markdown

[Toso et al. "Meta-Learning Linear Quadratic Regulators: A Policy Gradient MAML Approach for Model-Free LQR." Proceedings of the 6th Annual Learning for Dynamics & Control Conference, 2024.](https://mlanthology.org/l4dc/2024/toso2024l4dc-metalearning/)

BibTeX

@inproceedings{toso2024l4dc-metalearning,
  title     = {{Meta-Learning Linear Quadratic Regulators: A Policy Gradient MAML Approach for Model-Free LQR}},
  author    = {Toso, Leonardo Felipe and Zhan, Donglin and Anderson, James and Wang, Han},
  booktitle = {Proceedings of the 6th Annual Learning for Dynamics \& Control Conference},
  year      = {2024},
  pages     = {902--915},
  volume    = {242},
  url       = {https://mlanthology.org/l4dc/2024/toso2024l4dc-metalearning/}
}