Stationary Deep Reinforcement Learning with Quantum K-Spin Hamiltonian Regularization
Abstract
Instability is a major issue in deep reinforcement learning (DRL) algorithms: performance varies widely across multiple runs. It is mainly caused by the existence of many local minima and is worsened by the multiple-fixed-point issue of Bellman's equation. As a fix, we propose a quantum K-spin Hamiltonian regularization term (called the H-term) to help a policy network converge to a high-quality local minimum. First, we take a quantum perspective by modeling a policy as a K-spin Ising model and employing a Hamiltonian to measure the energy of a policy. Then, we derive a novel Hamiltonian policy gradient theorem and design a generic actor-critic algorithm that utilizes the H-term to regularize the policy network. Finally, the proposed method reduces the variance of cumulative rewards by 65.2% to 85.6% on six MuJoCo tasks, compared with existing algorithms over 20 runs.
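The abstract describes the overall structure of the method: a standard actor-critic policy loss plus a weighted regularization term. A minimal sketch of that "loss + lambda * H" shape is below. Note that `h_term` here is a hypothetical placeholder (a simple K-th-power penalty on action log-probabilities), not the paper's actual K-spin Hamiltonian, and `lam` and `K` are illustrative parameter names.

```python
import numpy as np

def policy_loss(log_probs, returns):
    # Standard policy-gradient surrogate: maximize expected return,
    # i.e., minimize the negative return-weighted log-probabilities.
    return -np.mean(log_probs * returns)

def h_term(log_probs, K=2):
    # Placeholder "energy" of the policy: mean K-th power of the
    # action log-probabilities. This only stands in for the paper's
    # K-spin Hamiltonian to show where such a term would plug in.
    return np.mean(log_probs ** K)

def regularized_loss(log_probs, returns, lam=0.1, K=2):
    # Total objective: surrogate loss plus the weighted H-term,
    # mirroring the "loss + lambda * H" structure from the abstract.
    return policy_loss(log_probs, returns) + lam * h_term(log_probs, K)
```

In an actual actor-critic implementation, `returns` would be replaced by advantage estimates from the critic, and the combined loss would be minimized by gradient descent on the policy network's parameters.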
Cite
Text
Liu et al. "Stationary Deep Reinforcement Learning with Quantum K-Spin Hamiltonian Regularization." ICLR 2023 Workshops: Physics4ML, 2023.
Markdown
[Liu et al. "Stationary Deep Reinforcement Learning with Quantum K-Spin Hamiltonian Regularization." ICLR 2023 Workshops: Physics4ML, 2023.](https://mlanthology.org/iclrw/2023/liu2023iclrw-stationary/)
BibTeX
@inproceedings{liu2023iclrw-stationary,
title = {{Stationary Deep Reinforcement Learning with Quantum K-Spin Hamiltonian Regularization}},
author = {Liu, Xiao-Yang and Li, Zechu and Wu, Shixun and Wang, Xiaodong},
booktitle = {ICLR 2023 Workshops: Physics4ML},
year = {2023},
url = {https://mlanthology.org/iclrw/2023/liu2023iclrw-stationary/}
}