Quantum Algorithms for Finite-Horizon Markov Decision Processes

Abstract

In this work, we design quantum algorithms that are more efficient than classical algorithms to solve time-dependent and finite-horizon Markov Decision Processes (MDPs) in two distinct settings: (1) In the exact dynamics setting, where the agent has full knowledge of the environment’s dynamics (i.e., transition probabilities), we prove that our Quantum Value Iteration (QVI) algorithm QVI-1 achieves a quadratic speedup in the size of the action space $(A)$ compared with the classical value iteration algorithm for computing the optimal policy ($\pi^{\ast}$) and the optimal V-value function ($V_{0}^{\ast}$). Furthermore, our algorithm QVI-2 provides an additional speedup in the size of the state space $(S)$ when obtaining near-optimal policies and V-value functions. Both QVI-1 and QVI-2 achieve quantum query complexities that provably improve upon classical lower bounds, particularly in their dependences on $S$ and $A$. (2) In the generative model setting, where samples from the environment are accessible in quantum superposition, we prove that our algorithms QVI-3 and QVI-4 achieve improvements in sample complexity over the state-of-the-art (SOTA) classical algorithm in terms of $A$, estimation error $(\epsilon)$, and time horizon $(H)$. More importantly, we prove quantum lower bounds to show that QVI-3 and QVI-4 are asymptotically optimal, up to logarithmic factors, assuming a constant time horizon.

Cite

Text

Luo et al. "Quantum Algorithms for Finite-Horizon Markov Decision Processes." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Luo et al. "Quantum Algorithms for Finite-Horizon Markov Decision Processes." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/luo2025icml-quantum/)

BibTeX

@inproceedings{luo2025icml-quantum,
  title     = {{Quantum Algorithms for Finite-Horizon Markov Decision Processes}},
  author    = {Luo, Bin and Huang, Yuwen and Allcock, Jonathan and Lin, Xiaojun and Zhang, Shengyu and Lui, John C.S.},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {41200-41234},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/luo2025icml-quantum/}
}