Offline Quantum Reinforcement Learning in a Conservative Manner
Abstract
Empowering reinforcement learning (RL) with quantum computing, an approach dubbed quantum RL (QRL), has recently attracted much attention as a way to reap quantum advantage. However, current QRL algorithms employ an online learning scheme: the policy, which runs on a quantum computer, must interact with the environment to collect experience, which can be expensive and dangerous in practical applications. In this paper, we address this problem in an offline learning manner. Specifically, we develop the first offline quantum RL (offline QRL) algorithm, named CQ2L (Conservative Quantum Q-learning), which learns from offline samples and requires no interaction with the environment. CQ2L uses variational quantum circuits (VQCs), improved with data re-uploading and scaling parameters, to represent the Q-value functions of agents. To suppress the overestimation of Q-values that results from offline data, we first employ a double Q-learning framework to reduce the overestimation bias, and then design a penalty term that encourages conservative Q-values. We conduct extensive experiments to demonstrate that CQ2L can successfully solve offline QRL tasks that its online counterpart cannot.
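To make the two overestimation remedies named in the abstract concrete, the following is a minimal sketch of a loss that combines a double Q-learning target with a conservative penalty. It is not the paper's implementation: the abstract does not give the form of the penalty, so a log-sum-exp penalty in the style of classical conservative Q-learning is assumed here, and the Q-values are plain NumPy arrays standing in for VQC outputs. The function name, argument names, `gamma`, and `alpha` are all hypothetical.

```python
import numpy as np

def conservative_double_q_loss(q_online, q_online_next, q_target_next,
                               actions, rewards, dones,
                               gamma=0.99, alpha=1.0):
    """Hypothetical loss: double-Q TD error plus an assumed log-sum-exp penalty.

    q_online:      (B, A) online-network Q-values at states s
    q_online_next: (B, A) online-network Q-values at next states s'
    q_target_next: (B, A) target-network Q-values at next states s'
    actions:       (B,) integer actions taken in the offline dataset
    rewards, dones: (B,) float arrays (dones is 1.0 for terminal steps)
    """
    batch = np.arange(len(actions))

    # Double Q-learning: the online network selects the next action,
    # the target network evaluates it.
    greedy = np.argmax(q_online_next, axis=1)
    bootstrap = q_target_next[batch, greedy]
    td_target = rewards + gamma * (1.0 - dones) * bootstrap

    q_data = q_online[batch, actions]
    td_loss = 0.5 * np.mean((q_data - td_target) ** 2)

    # Assumed conservative penalty: push down a soft maximum over all
    # actions while pushing up the Q-value of the dataset action.
    m = q_online.max(axis=1, keepdims=True)  # subtracted for numerical stability
    logsumexp = m[:, 0] + np.log(np.exp(q_online - m).sum(axis=1))
    penalty = np.mean(logsumexp - q_data)

    return td_loss + alpha * penalty, td_loss, penalty
```

Because the log-sum-exp is never smaller than any single Q-value, the penalty in this sketch is non-negative, and it grows when actions absent from the dataset receive high Q-values. The weight `alpha` trades conservatism against fitting the TD target.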
Cite
Text
Cheng et al. "Offline Quantum Reinforcement Learning in a Conservative Manner." AAAI Conference on Artificial Intelligence, 2023. doi:10.1609/AAAI.V37I6.25872
Markdown
[Cheng et al. "Offline Quantum Reinforcement Learning in a Conservative Manner." AAAI Conference on Artificial Intelligence, 2023.](https://mlanthology.org/aaai/2023/cheng2023aaai-offline/) doi:10.1609/AAAI.V37I6.25872
BibTeX
@inproceedings{cheng2023aaai-offline,
title = {{Offline Quantum Reinforcement Learning in a Conservative Manner}},
author = {Cheng, Zhihao and Zhang, Kaining and Shen, Li and Tao, Dacheng},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2023},
pages = {7148-7156},
doi = {10.1609/AAAI.V37I6.25872},
url = {https://mlanthology.org/aaai/2023/cheng2023aaai-offline/}
}