Learning HJB Viscosity Solutions with PINNs for Continuous-Time Reinforcement Learning

Abstract

Despite recent advances in Reinforcement Learning (RL), Markov Decision Processes are not always the best choice for modeling complex dynamical systems that require interactions at high frequency. Because it can work with arbitrary time intervals, Continuous-Time Reinforcement Learning (CTRL) is better suited to those problems. In CTRL, the evolution of the value function is described by the Hamilton-Jacobi-Bellman (HJB) equation rather than by the discrete-time Bellman equation. Although the value function is a solution of the HJB equation, it may not be the only one. To distinguish the value function from other solutions, one must look for the viscosity solution of the HJB equation; viscosity solutions constitute a special class of solutions with uniqueness and stability properties. This paper proposes a novel approach that approximates the value function by training a physics-informed neural network (PINN) through a specific *$\epsilon$-scheduling* iterative process, which constrains the PINN to converge towards the viscosity solution. Experimental results on classical control tasks show PINNs outperforming popular RL algorithms in a nearly continuous-time setting.
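
For orientation, the equations behind these statements can be sketched in a standard textbook form. The notation is an assumption and is not taken from the abstract: deterministic dynamics $\dot{x} = f(x, u)$, reward $r(x, u)$, discount rate $\rho > 0$, and value function $V$. Under these assumptions, the discounted infinite-horizon HJB equation reads

$$
\rho\, V(x) = \max_{u} \Big[ r(x, u) + \nabla V(x)^{\top} f(x, u) \Big].
$$

A classical way to single out the viscosity solution is the vanishing-viscosity method, which adds a small diffusion term and lets it shrink to zero:

$$
\rho\, V_{\epsilon}(x) = \max_{u} \Big[ r(x, u) + \nabla V_{\epsilon}(x)^{\top} f(x, u) \Big] + \epsilon\, \Delta V_{\epsilon}(x), \qquad \epsilon \to 0^{+}.
$$

In a PINN setting, the residual of the second equation would plausibly serve as the training loss while $\epsilon$ is decreased across iterations. Whether the paper's $\epsilon$-scheduling takes exactly this form is an assumption here; the full text should be consulted for the actual formulation.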

Cite

Text

Shilova et al. "Learning HJB Viscosity Solutions with PINNs for Continuous-Time Reinforcement Learning." ICML 2024 Workshops: RLControlTheory, 2024.

Markdown

[Shilova et al. "Learning HJB Viscosity Solutions with PINNs for Continuous-Time Reinforcement Learning." ICML 2024 Workshops: RLControlTheory, 2024.](https://mlanthology.org/icmlw/2024/shilova2024icmlw-learning/)

BibTeX

@inproceedings{shilova2024icmlw-learning,
  title     = {{Learning HJB Viscosity Solutions with PINNs for Continuous-Time Reinforcement Learning}},
  author    = {Shilova, Alena and Delliaux, Thomas and Preux, Philippe and Raffin, Bruno},
  booktitle = {ICML 2024 Workshops: RLControlTheory},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/shilova2024icmlw-learning/}
}