Optimal and Instance-Dependent Guarantees for Markovian Linear Stochastic Approximation

Abstract

We study stochastic approximation procedures for approximately solving a $d$-dimensional linear fixed point equation based on observing a trajectory of length $n$ from an ergodic Markov chain. We first exhibit a non-asymptotic bound of the order $t_{\mathrm{mix}} \tfrac{d}{n}$ on the squared error of the last iterate of a standard scheme, where $t_{\mathrm{mix}}$ is a mixing time. We then prove a non-asymptotic instance-dependent bound on a suitably averaged sequence of iterates, with a leading term that matches the local asymptotic minimax limit, including sharp dependence on the parameters $(d, t_{\mathrm{mix}})$ in the higher order terms. We complement these upper bounds with a non-asymptotic minimax lower bound that establishes the instance-optimality of the averaged SA estimator. We derive corollaries of these results for policy evaluation with Markov noise—covering the TD($\lambda$) family of algorithms for all $\lambda \in [0, 1)$—and linear autoregressive models. Our instance-dependent characterizations open the door to the design of fine-grained model selection procedures for hyperparameter tuning (e.g., choosing the value of $\lambda$ when running the TD($\lambda$) algorithm).

Cite

Text

Mou et al. "Optimal and Instance-Dependent Guarantees for Markovian Linear Stochastic Approximation." Conference on Learning Theory, 2022.

Markdown

[Mou et al. "Optimal and Instance-Dependent Guarantees for Markovian Linear Stochastic Approximation." Conference on Learning Theory, 2022.](https://mlanthology.org/colt/2022/mou2022colt-optimal/)

BibTeX

@inproceedings{mou2022colt-optimal,
  title     = {{Optimal and Instance-Dependent Guarantees for Markovian Linear Stochastic Approximation}},
  author    = {Mou, Wenlong and Pananjady, Ashwin and Wainwright, Martin and Bartlett, Peter},
  booktitle = {Conference on Learning Theory},
  year      = {2022},
  pages     = {2060-2061},
  volume    = {178},
  url       = {https://mlanthology.org/colt/2022/mou2022colt-optimal/}
}