Sketched Newton Value Iteration for Large-Scale Markov Decision Processes

Abstract

Value Iteration (VI) is a classic algorithm for solving Markov Decision Processes (MDPs), and it lays the foundation for more advanced reinforcement learning algorithms such as Q-learning. As a first-order method, however, VI may require a large number of iterations to converge. In this paper, we introduce the Newton Value Iteration (NVI) algorithm, which, unlike some previous second-order methods, removes the dependence on the action-space dimension. Consequently, NVI can efficiently handle MDPs with large action spaces. Building upon NVI, we propose a novel approach called Sketched Newton Value Iteration (SNVI) to tackle MDPs with both large state and action spaces. SNVI not only inherits the stability and fast convergence of second-order algorithms but also significantly reduces computational complexity, making it highly scalable. Extensive experiments demonstrate the superiority of our algorithms over traditional VI and previously proposed second-order VI algorithms.
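For readers unfamiliar with the baseline the paper improves upon, the following is a minimal sketch of classic tabular value iteration (the first-order method the abstract refers to), repeatedly applying the Bellman optimality update until the value function stabilizes. The tiny 2-state, 2-action MDP here is invented purely for illustration and is not from the paper.

```python
# P[s][a] = list of (next_state, probability); R[s][a] = immediate reward.
# This toy MDP is a made-up example, not taken from the paper.
P = {
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)],           1: [(1, 0.5), (0, 0.5)]},
}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.0, 1: 2.0}}
gamma = 0.9  # discount factor

def value_iteration(P, R, gamma, tol=1e-8, max_iter=10_000):
    """Classic (first-order) value iteration: iterate the Bellman operator."""
    V = {s: 0.0 for s in P}
    for _ in range(max_iter):
        # Bellman optimality update:
        #   V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) V(s') ]
        V_new = {
            s: max(
                R[s][a] + gamma * sum(p * V[sp] for sp, p in P[s][a])
                for a in P[s]
            )
            for s in P
        }
        # Stop when the sup-norm change falls below the tolerance.
        if max(abs(V_new[s] - V[s]) for s in P) < tol:
            return V_new
        V = V_new
    return V

V_star = value_iteration(P, R, gamma)
print(V_star)
```

Each sweep contracts the sup-norm error by a factor of gamma, which is exactly why convergence is slow when gamma is close to 1; the paper's Newton-type updates aim to overcome this first-order behavior.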

Cite

Text

Liu et al. "Sketched Newton Value Iteration for Large-Scale Markov Decision Processes." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I12.29301

Markdown

[Liu et al. "Sketched Newton Value Iteration for Large-Scale Markov Decision Processes." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/liu2024aaai-sketched/) doi:10.1609/AAAI.V38I12.29301

BibTeX

@inproceedings{liu2024aaai-sketched,
  title     = {{Sketched Newton Value Iteration for Large-Scale Markov Decision Processes}},
  author    = {Liu, Jinsong and Xie, Chenghan and Deng, Qi and Ge, Dongdong and Ye, Yinyu},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {13936--13944},
  doi       = {10.1609/AAAI.V38I12.29301},
  url       = {https://mlanthology.org/aaai/2024/liu2024aaai-sketched/}
}