Solving Long-Run Average Reward Robust MDPs via Stochastic Games

Chatterjee, Krishnendu; Goharshady, Ehsan Kafshdar; Karrabi, Mehrdad; Novotný, Petr; Zikelic, Dorde

doi:10.24963/ijcai.2024/741

Solving Long-Run Average Reward Robust MDPs via Stochastic Games

Krishnendu Chatterjee, Ehsan Kafshdar Goharshady, Mehrdad Karrabi, Petr Novotný, Dorde Zikelic

IJCAI 2024 pp. 6707-6715

doi:10.24963/ijcai.2024/741 /ijcai/2024/chatterjee2024ijcai-solving/

Abstract

Hierarchical optimization is attracting significant attentions as it can be applied to a broad range of machine learning tasks. Recently, many algorithms are proposed to improve the theoretical results of minimax and bilevel optimizations. Among these works, a core issue that has not been well studies is to escape saddle point and find local minimum. In this paper, thus, we investigate the methods to achieve second-order optimality for nonconvex minimax and bilevel optimization. Specifically, we propose a new algorithm named PRGDA without the computation of second order derivative of the primal function. In nonconvex-strongly-concave minimax optimization, we prove that our algorithm can find a second-order stationary point with the gradient complexity that matches state-of-the-art result to find first-order stationary point. To our best knowledge, PRGDA is the first stochastic algorithm that is guaranteed to obtain the second-order stationary point for nonconvex minimax problems. In nonconvex-strongly-convex bilevel optimization, our method also achieves better gradient complexity to find local minimum. Finally, we conduct two numerical experiments to validate the performance of our new method.

PDF IJCAI Semantic Scholar

Cite

Text

Chatterjee et al. "Solving Long-Run Average Reward Robust MDPs via Stochastic Games." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/741

Markdown

[Chatterjee et al. "Solving Long-Run Average Reward Robust MDPs via Stochastic Games." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/chatterjee2024ijcai-solving/) doi:10.24963/ijcai.2024/741

BibTeX

@inproceedings{chatterjee2024ijcai-solving,
  title     = {{Solving Long-Run Average Reward Robust MDPs via Stochastic Games}},
  author    = {Chatterjee, Krishnendu and Goharshady, Ehsan Kafshdar and Karrabi, Mehrdad and Novotný, Petr and Zikelic, Dorde},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {6707-6715},
  doi       = {10.24963/ijcai.2024/741},
  url       = {https://mlanthology.org/ijcai/2024/chatterjee2024ijcai-solving/}
}