Global Optimality of Single-Timescale Actor-Critic Under Continuous State-Action Space: A Study on Linear Quadratic Regulator

Chen, Xuyang; Duan, Jingliang; Zhao, Lin

doi:10.24963/ijcai.2024/422

IJCAI 2024 pp. 3816-3824

Abstract

Coalition formation involves self-organized coalitions generated through strategic interactions of autonomous selfish agents. In online learning of coalition structures, agents' preferences toward each other are initially unknown before agents interact. Coalitions are formed iteratively based on preferences that agents learn online from repeated feedback resulting from their interactions. In this paper, we introduce online learning in coalition formation through the lens of distributed decision-making, where self-interested agents operate without global coordination or information sharing, and learn only from their own experience. Under our selfish perspective, each agent seeks to maximize her own utility. Thus, we analyze the system in terms of Nash stability, where no agent can improve her utility by unilaterally deviating. We devise a sample-efficient decentralized algorithm for selfish agents that minimize their Nash regret, yielding approximately Nash stable solutions. In our algorithm, each agent uses only one utility feedback per round to update her strategy, but our algorithm still has Nash regret and sample complexity bounds that are optimal up to logarithmic factors.

Cite

Text

Chen et al. "Global Optimality of Single-Timescale Actor-Critic Under Continuous State-Action Space: A Study on Linear Quadratic Regulator." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/422

Markdown

[Chen et al. "Global Optimality of Single-Timescale Actor-Critic Under Continuous State-Action Space: A Study on Linear Quadratic Regulator." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/chen2024ijcai-global/) doi:10.24963/ijcai.2024/422

BibTeX

@inproceedings{chen2024ijcai-global,
  title     = {{Global Optimality of Single-Timescale Actor-Critic Under Continuous State-Action Space: A Study on Linear Quadratic Regulator}},
  author    = {Chen, Xuyang and Duan, Jingliang and Zhao, Lin},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {3816--3824},
  doi       = {10.24963/ijcai.2024/422},
  url       = {https://mlanthology.org/ijcai/2024/chen2024ijcai-global/}
}