Hamilton-Jacobi Deep Q-Learning for Deterministic Continuous-Time Systems with Lipschitz Continuous Controls

Abstract

In this paper, we propose Q-learning algorithms for continuous-time deterministic optimal control problems with Lipschitz continuous controls. A new class of Hamilton-Jacobi-Bellman (HJB) equations is derived from applying the dynamic programming principle to continuous-time Q-functions. Our method is based on a novel semi-discrete version of the HJB equation, which is proposed to design a Q-learning algorithm that uses data collected in discrete time without discretizing or approximating the system dynamics. We identify the conditions under which the Q-function estimated by this algorithm converges to the optimal Q-function. For practical implementation, we propose the Hamilton-Jacobi DQN, which extends the idea of deep Q-networks (DQN) to our continuous control setting. This approach does not require actor networks or numerical solutions to optimization problems for greedy actions since the HJB equation provides a simple characterization of optimal controls via ordinary differential equations. We empirically demonstrate the performance of our method through benchmark tasks and high-dimensional linear-quadratic problems.
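As a rough illustration of the last point in the abstract, the sketch below shows how a greedy control could be advanced by integrating an ordinary differential equation instead of solving an optimization problem or querying an actor network. It assumes the control's rate of change is bounded by a constant `L_rate` and that the control moves at that maximal rate along the normalized gradient of Q with respect to the action. That update rule, the toy quadratic `q_value`, the finite-difference gradient, and all names and parameter values are illustrative assumptions, not details given in the abstract.

```python
import numpy as np


def q_value(x, a):
    # Hypothetical toy Q-function (an assumption for illustration only):
    # concave in the action a, maximized at a = -x.
    return -np.sum((a + x) ** 2)


def grad_a(q, x, a, eps=1e-6):
    # Central finite-difference gradient of q with respect to the action a.
    g = np.zeros_like(a)
    for i in range(a.size):
        e = np.zeros_like(a)
        e[i] = eps
        g[i] = (q(x, a + e) - q(x, a - e)) / (2.0 * eps)
    return g


def greedy_control_step(q, x, a, lipschitz, h):
    # One explicit Euler step of the assumed control ODE
    #   da/dt = lipschitz * grad_a Q / ||grad_a Q||,
    # so the control changes by at most lipschitz * h per step.
    g = grad_a(q, x, a)
    norm = np.linalg.norm(g)
    if norm < 1e-12:
        return a.copy()  # gradient vanishes: keep the current control
    return a + h * lipschitz * g / norm


x = np.array([1.0, -2.0])   # fixed state, for illustration
a = np.zeros(2)             # initial control
L_rate, h = 1.0, 0.05       # assumed rate bound and sampling interval

max_step = 0.0
for _ in range(100):
    a_next = greedy_control_step(q_value, x, a, L_rate, h)
    max_step = max(max_step, np.linalg.norm(a_next - a))
    a = a_next

print("final control:", a)
print("largest per-step change:", max_step)
```

In this toy setting the control drifts toward the maximizer of the assumed Q-function while never changing by more than `L_rate * h` in a single step, which is the sense in which the control trajectory stays Lipschitz continuous.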

Cite

Text

Kim et al. "Hamilton-Jacobi Deep Q-Learning for Deterministic Continuous-Time Systems with Lipschitz Continuous Controls." Journal of Machine Learning Research, 2021.

Markdown

[Kim et al. "Hamilton-Jacobi Deep Q-Learning for Deterministic Continuous-Time Systems with Lipschitz Continuous Controls." Journal of Machine Learning Research, 2021.](https://mlanthology.org/jmlr/2021/kim2021jmlr-hamiltonjacobi/)

BibTeX

@article{kim2021jmlr-hamiltonjacobi,
  title     = {{Hamilton-Jacobi Deep Q-Learning for Deterministic Continuous-Time Systems with Lipschitz Continuous Controls}},
  author    = {Kim, Jeongho and Shin, Jaeuk and Yang, Insoon},
  journal   = {Journal of Machine Learning Research},
  year      = {2021},
  pages     = {1--34},
  volume    = {22},
  url       = {https://mlanthology.org/jmlr/2021/kim2021jmlr-hamiltonjacobi/}
}