Deep Recurrent Optimal Stopping
Abstract
Deep neural networks (DNNs) have recently emerged as a powerful paradigm for solving Markovian optimal stopping problems. However, a ready extension of DNN-based methods to non-Markovian settings requires significant state and parameter space expansion, manifesting the curse of dimensionality. Further, efficient state-space transformations permitting Markovian approximations, such as those afforded by recurrent neural networks (RNNs), are either structurally infeasible or are confounded by the curse of non-Markovianity. Considering these issues, we introduce, for the first time, an optimal stopping policy gradient algorithm (OSPG) that can leverage RNNs effectively in non-Markovian settings by implicitly optimizing value functions without recursion, mitigating the curse of non-Markovianity. The OSPG algorithm is derived from an inference procedure on a novel Bayesian network representation of discrete-time non-Markovian optimal stopping trajectories and, as a consequence, yields an offline policy gradient algorithm that eliminates expensive Monte Carlo policy rollouts.
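The paper's OSPG algorithm itself is not reproduced here, but the core idea it describes — training a recurrent stopping policy directly on a fixed batch of offline trajectories, with the value of the stochastic policy computed in closed form rather than by Monte Carlo rollouts of stopping decisions — can be illustrated with a minimal numpy sketch. Everything below is a hypothetical toy (the random-walk `paths`, the tiny tanh RNN, the `expected_payoff` objective, and the numerical-gradient ascent stand in for the paper's actual policy gradient), shown only to make the setup concrete:

```python
import numpy as np

rng = np.random.default_rng(0)

# Offline trajectories: a batch of random-walk paths (hypothetical stand-in data).
B, T = 64, 10                        # batch size, horizon
paths = np.cumsum(rng.normal(0.0, 1.0, size=(B, T)), axis=1)
rewards = np.maximum(paths, 0.0)     # payoff received if we stop at step t

H = 4                                # RNN hidden size
theta = rng.normal(0.0, 0.3, size=H * H + 2 * H + 1)  # flat parameter vector

def unpack(th):
    U = th[:H * H].reshape(H, H)           # hidden-to-hidden weights
    w = th[H * H:H * H + H]                # input-to-hidden weights (scalar input)
    v = th[H * H + H:H * H + 2 * H]        # hidden-to-logit weights
    b = th[-1]                             # logit bias
    return U, w, v, b

def expected_payoff(th):
    """Exact expected payoff of the stochastic stopping policy on the offline
    paths. No Monte Carlo rollout of stopping decisions is needed, because
    P(stop at t) = p_t * prod_{s<t} (1 - p_s) is available in closed form."""
    U, w, v, b = unpack(th)
    h = np.zeros((B, H))
    survive = np.ones(B)                   # prob. of not having stopped before t
    J = np.zeros(B)
    for t in range(T):
        h = np.tanh(h @ U.T + np.outer(paths[:, t], w))       # RNN state update
        p = 1.0 / (1.0 + np.exp(-(h @ v + b)))                # stop prob. at t
        q = survive * p if t < T - 1 else survive             # forced stop at horizon
        J += q * rewards[:, t]
        survive = survive * (1.0 - p)
    return J.mean()

def num_grad(f, th, eps=1e-5):
    """Central-difference gradient (a stand-in for the paper's policy gradient)."""
    g = np.zeros_like(th)
    for i in range(th.size):
        d = np.zeros_like(th)
        d[i] = eps
        g[i] = (f(th + d) - f(th - d)) / (2 * eps)
    return g

J0 = expected_payoff(theta)
for _ in range(20):                        # plain gradient ascent, no new rollouts
    theta += 0.05 * num_grad(expected_payoff, theta)
J1 = expected_payoff(theta)
```

Because the recurrent hidden state summarizes the full path history, the stopping probability at each step can condition on non-Markovian features of the trajectory, which is the structural point the abstract makes about RNN-based policies.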
Cite
Text
Venkata and Bhattacharyya. "Deep Recurrent Optimal Stopping." Neural Information Processing Systems, 2023.
Markdown
[Venkata and Bhattacharyya. "Deep Recurrent Optimal Stopping." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/venkata2023neurips-deep/)
BibTeX
@inproceedings{venkata2023neurips-deep,
  title     = {{Deep Recurrent Optimal Stopping}},
  author    = {Venkata, Niranjan Damera and Bhattacharyya, Chiranjib},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/venkata2023neurips-deep/}
}