Toward Understanding Latent Model Learning in MuZero: A Case Study in Linear Quadratic Gaussian Control

Abstract

We study the problem of representation learning for control from partial and potentially high-dimensional observations. We approach this problem via direct latent model learning, in which one learns a dynamical model directly in some latent state space by predicting costs. In particular, we establish finite-sample guarantees for finding a near-optimal representation function and a near-optimal controller using the directly learned latent model, for infinite-horizon time-invariant Linear Quadratic Gaussian (LQG) control. Part of our approach to latent model learning closely resembles MuZero, a recent breakthrough in empirical reinforcement learning, in that it learns latent dynamics implicitly by predicting cumulative costs. A key technical contribution of this work is to prove persistency of excitation for a new stochastic process that arises from the analysis of quadratic regression in our approach.
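For readers less familiar with the setting, the standard time-invariant partially observed LQG formulation is sketched below; the symbols (A, B, C, Q, R, W, V) are generic notation and may differ from the paper's.

\[
x_{t+1} = A x_t + B u_t + w_t, \qquad y_t = C x_t + v_t, \qquad w_t \sim \mathcal{N}(0, W),\ \ v_t \sim \mathcal{N}(0, V),
\]
with per-step quadratic cost \(c_t = x_t^\top Q x_t + u_t^\top R u_t\) and infinite-horizon average objective
\[
\min_{\pi}\ \limsup_{T \to \infty} \frac{1}{T}\, \mathbb{E}\Big[\textstyle\sum_{t=0}^{T-1} c_t\Big].
\]

The controller observes only \(y_t\), not the state \(x_t\), so a representation function must map the observation history to a latent state on which the learned dynamics and controller operate; in the approach studied here, this latent model is learned by predicting the (cumulative) costs \(c_t\) rather than by reconstructing observations.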

Cite

Text

Tian et al. "Toward Understanding Latent Model Learning in MuZero: A Case Study in Linear Quadratic Gaussian Control." ICML 2023 Workshops: Frontiers4LCD, 2023.

Markdown

[Tian et al. "Toward Understanding Latent Model Learning in MuZero: A Case Study in Linear Quadratic Gaussian Control." ICML 2023 Workshops: Frontiers4LCD, 2023.](https://mlanthology.org/icmlw/2023/tian2023icmlw-understanding/)

BibTeX

@inproceedings{tian2023icmlw-understanding,
  title     = {{Toward Understanding Latent Model Learning in MuZero: A Case Study in Linear Quadratic Gaussian Control}},
  author    = {Tian, Yi and Zhang, Kaiqing and Tedrake, Russ and Sra, Suvrit},
  booktitle = {ICML 2023 Workshops: Frontiers4LCD},
  year      = {2023},
  url       = {https://mlanthology.org/icmlw/2023/tian2023icmlw-understanding/}
}