Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation
Abstract
Actor-critic methods integrating target networks have exhibited remarkable empirical success in deep reinforcement learning. However, a theoretical understanding of the use of target networks in actor-critic methods is largely missing in the literature. In this paper, we reduce this gap between theory and practice by proposing the first theoretical analysis of an online target-based actor-critic algorithm with linear function approximation in the discounted reward setting. Our algorithm uses three different timescales: one for the actor and two for the critic. Instead of using the standard single-timescale temporal difference (TD) learning algorithm as a critic, we use a two-timescale target-based version of TD learning closely inspired by practical actor-critic algorithms implementing target networks. First, we establish asymptotic convergence results for both the critic and the actor under Markovian sampling. Then, we provide a finite-time analysis showing the impact of incorporating a target network into actor-critic methods.
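As a rough illustration of the kind of update scheme the abstract describes (not the paper's exact algorithm or notation), the sketch below combines TD(0) with linear function approximation, a slowly tracked target weight vector for bootstrapping, and a policy-gradient actor step, each with its own step size. The synthetic MDP, feature maps, and all variable names are illustrative assumptions.

```python
# Minimal sketch of a target-based actor-critic update with linear function
# approximation. Three step sizes play the role of the three timescales:
# fast online critic, slower target weights, slowest actor.
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions, d = 5, 2, 4   # small synthetic MDP (assumption)
gamma = 0.95                       # discount factor

# Random transition kernel and rewards, for illustration only.
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.normal(size=(n_states, n_actions))

features = rng.normal(size=(n_states, d))                     # critic features phi(s)
policy_features = rng.normal(size=(n_states, n_actions, d))   # actor features psi(s, a)

w = np.zeros(d)          # online critic weights (fastest timescale)
w_target = np.zeros(d)   # target critic weights (intermediate timescale)
theta = np.zeros(d)      # actor weights (slowest timescale)

def policy(s, theta):
    """Softmax policy with linear action preferences."""
    prefs = policy_features[s] @ theta
    prefs -= prefs.max()
    p = np.exp(prefs)
    return p / p.sum()

alpha_critic, beta_target, eta_actor = 0.05, 0.01, 0.002  # three step sizes

s = 0
for t in range(20000):
    probs = policy(s, theta)
    a = rng.choice(n_actions, p=probs)
    s_next = rng.choice(n_states, p=P[s, a])
    r = R[s, a]

    # Critic: TD error bootstraps from the *target* weights at the next state.
    td_error = r + gamma * features[s_next] @ w_target - features[s] @ w
    w += alpha_critic * td_error * features[s]

    # Target network analogue: slowly track the online critic weights.
    w_target += beta_target * (w - w_target)

    # Actor: policy-gradient step using the TD error as an advantage proxy.
    grad_log_pi = policy_features[s, a] - probs @ policy_features[s]
    theta += eta_actor * td_error * grad_log_pi

    s = s_next
```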
Cite
Text
Barakat et al. "Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation." Artificial Intelligence and Statistics, 2022.
Markdown
[Barakat et al. "Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation." Artificial Intelligence and Statistics, 2022.](https://mlanthology.org/aistats/2022/barakat2022aistats-analysis/)
BibTeX
@inproceedings{barakat2022aistats-analysis,
title = {{Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation}},
author = {Barakat, Anas and Bianchi, Pascal and Lehmann, Julien},
booktitle = {Artificial Intelligence and Statistics},
year = {2022},
pages = {991-1040},
volume = {151},
url = {https://mlanthology.org/aistats/2022/barakat2022aistats-analysis/}
}