A Primal-Dual Perspective for Distributed TD-Learning
Abstract
The goal of this paper is to investigate distributed temporal difference (TD) learning for networked multi-agent Markov decision processes. The proposed approach builds on distributed optimization algorithms that can be interpreted as primal-dual ordinary differential equation (ODE) dynamics subject to null-space constraints. Leveraging the exponential convergence of these dynamics, we analyze the behavior of the final iterate in various distributed TD-learning scenarios, covering both constant and diminishing step-sizes and both i.i.d. and Markovian observation models. Unlike existing methods, the proposed algorithm does not require the underlying communication network to be characterized by a doubly stochastic matrix.
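To make the abstract's idea concrete, here is a minimal illustrative sketch (not the paper's exact algorithm): a distributed fixed-point iteration written as discretized primal-dual dynamics over a network, where a graph Laplacian encodes the null-space (consensus) constraint and a dual variable enforces it. All constants, the graph, and the scalar "TD-style" drift terms below are assumptions chosen for illustration.

```python
import numpy as np

# Line graph on 3 agents; note the Laplacian need not arise from a
# doubly stochastic matrix (a point the paper emphasizes).
L = np.array([[ 1., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  1.]])

a = np.array([-1.0, -2.0, -1.5])   # local drift coefficients (stable: a_i < 0)
b = np.array([ 1.0,  2.0,  3.0])   # local offsets

# The network-wide fixed point solves sum_i (a_i * theta + b_i) = 0.
theta_star = -b.sum() / a.sum()    # = 4/3 for the values above

theta = np.zeros(3)   # primal variables (one parameter per agent)
w = np.zeros(3)       # dual variables enforcing consensus (L @ theta = 0)
alpha = 0.02          # small constant step-size (Euler discretization)

for _ in range(50_000):
    # Primal step: local drift + consensus coupling + dual correction.
    theta_next = theta + alpha * (a * theta + b - L @ theta - L @ w)
    # Dual step: integrates the consensus violation L @ theta.
    w = w + alpha * (L @ theta)
    theta = theta_next

print(theta)  # every agent's parameter approaches theta_star
```

At equilibrium the dual update forces `L @ theta = 0` (all agents agree), and summing the primal stationarity condition over agents recovers the global fixed point, mirroring the null-space-constrained primal-dual ODE viewpoint described above.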
Cite
Text
Lim and Lee. "A Primal-Dual Perspective for Distributed TD-Learning." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/634
Markdown
[Lim and Lee. "A Primal-Dual Perspective for Distributed TD-Learning." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/lim2025ijcai-primal/) doi:10.24963/IJCAI.2025/634
BibTeX
@inproceedings{lim2025ijcai-primal,
title = {{A Primal-Dual Perspective for Distributed TD-Learning}},
author = {Lim, Han-Dong and Lee, Donghwan},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2025},
pages = {5698--5706},
doi = {10.24963/IJCAI.2025/634},
url = {https://mlanthology.org/ijcai/2025/lim2025ijcai-primal/}
}