Quantum Policy Iteration via Amplitude Estimation and Grover Search – Towards Quantum Advantage for Reinforcement Learning

Abstract

We present a full implementation and simulation of a novel quantum reinforcement learning method. Our work is a detailed and formal proof of concept for how quantum algorithms can be used to solve reinforcement learning problems and shows that, given access to error- free, efficient quantum realizations of the agent and environment, quantum methods can yield provable improvements over classical Monte-Carlo based methods in terms of sample complexity. Our approach shows in detail how to combine amplitude estimation and Grover search into a policy evaluation and improvement scheme. We first develop quantum policy evaluation (QPE) which is quadratically more efficient compared to an analogous classi- cal Monte Carlo estimation and is based on a quantum mechanical realization of a finite Markov decision process (MDP). Building on QPE, we derive a quantum policy iteration that repeatedly improves an initial policy using Grover search until the optimum is reached. Finally, we present an implementation of our algorithm for a two-armed bandit MDP which we then simulate.

Cite

Text

Wiedemann et al. "Quantum Policy Iteration via Amplitude Estimation and Grover Search – Towards Quantum Advantage for Reinforcement Learning." Transactions on Machine Learning Research, 2023.

Markdown

[Wiedemann et al. "Quantum Policy Iteration via Amplitude Estimation and Grover Search – Towards Quantum Advantage for Reinforcement Learning." Transactions on Machine Learning Research, 2023.](https://mlanthology.org/tmlr/2023/wiedemann2023tmlr-quantum/)

BibTeX

@article{wiedemann2023tmlr-quantum,
  title     = {{Quantum Policy Iteration via Amplitude Estimation and Grover Search – Towards Quantum Advantage for Reinforcement Learning}},
  author    = {Wiedemann, Simon and Hein, Daniel and Udluft, Steffen and Mendl, Christian B.},
  journal   = {Transactions on Machine Learning Research},
  year      = {2023},
  url       = {https://mlanthology.org/tmlr/2023/wiedemann2023tmlr-quantum/}
}