Finite-Time Error Bounds for Linear Stochastic Approximation andTD Learning

Abstract

We consider the dynamics of a linear stochastic approximation algorithm driven by Markovian noise, and derive finite-time bounds on the moments of the error, i.e., deviation of the output of the algorithm from the equilibrium point of an associated ordinary differential equation (ODE). We obtain finite-time bounds on the mean-square error in the case of constant step-size algorithms by considering the drift of an appropriately chosen Lyapunov function. The Lyapunov function can be interpreted either in terms of Stein’s method to obtain bounds on steady-state performance or in terms of Lyapunov stability theory for linear ODEs. We also provide a comprehensive treatment of the moments of the square of the 2-norm of the approximation error. Our analysis yields the following results: (i) for a given step-size, we show that the lower-order moments can be made small as a function of the step-size and can be upper-bounded by the moments of a Gaussian random variable; (ii) we show that the higher-order moments beyond a threshold may be infinite in steady-state; and (iii) we characterize the number of samples needed for the finite-time bounds to be of the same order as the steady-state bounds. As a by-product of our analysis, we also solve the open problem of obtaining finite-time bounds for the performance of temporal difference learning algorithms with linear function approximation and a constant step-size, without requiring a projection step or an i.i.d. noise assumption.

Cite

Text

Srikant and Ying. "Finite-Time Error Bounds for Linear Stochastic Approximation andTD Learning." Conference on Learning Theory, 2019.

Markdown

[Srikant and Ying. "Finite-Time Error Bounds for Linear Stochastic Approximation andTD Learning." Conference on Learning Theory, 2019.](https://mlanthology.org/colt/2019/srikant2019colt-finitetime/)

BibTeX

@inproceedings{srikant2019colt-finitetime,
  title     = {{Finite-Time Error Bounds for Linear Stochastic Approximation andTD Learning}},
  author    = {Srikant, R. and Ying, Lei},
  booktitle = {Conference on Learning Theory},
  year      = {2019},
  pages     = {2803-2830},
  volume    = {99},
  url       = {https://mlanthology.org/colt/2019/srikant2019colt-finitetime/}
}