Continuous-Time Reward Machines

Falah, Amin; Guha, Shibashis; Trivedi, Ashutosh

doi:10.24963/IJCAI.2025/563

IJCAI 2025 pp. 5056-5064

Abstract

Reinforcement Learning (RL) is a sampling-based method for sequential decision-making, in which a learning agent iteratively converges toward an optimal policy by leveraging feedback from the environment in the form of scalar reward signals. While timing information is often abstracted in discrete-time domains, time-critical learning applications (such as queuing systems, population processes, and manufacturing systems) are naturally modeled as Continuous-Time Markov Decision Processes (CTMDPs). Since the seminal work of Bradtke and Duff, model-free RL for CTMDPs has become well understood. However, in many practical applications, practitioners possess high-quality information about system rates derived from traditional queuing theory, which learning agents could potentially exploit to accelerate convergence. Despite this, classical RL algorithms for CTMDPs typically re-learn these parameters through sampling. In this work, we propose continuous-time reward machines (CTRMs), a novel framework that embeds reward functions and real-time state-action dynamics into a unified structure. CTRMs enable RL agents to effectively navigate dense-time environments while leveraging reward shaping and counterfactual experiences for accelerated learning. Our empirical results demonstrate CTRMs' ability to improve learning efficiency in time-critical environments.

Cite

Text

Falah et al. "Continuous-Time Reward Machines." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/563

Markdown

[Falah et al. "Continuous-Time Reward Machines." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/falah2025ijcai-continuous/) doi:10.24963/IJCAI.2025/563

BibTeX

@inproceedings{falah2025ijcai-continuous,
  title     = {{Continuous-Time Reward Machines}},
  author    = {Falah, Amin and Guha, Shibashis and Trivedi, Ashutosh},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {5056--5064},
  doi       = {10.24963/IJCAI.2025/563},
  url       = {https://mlanthology.org/ijcai/2025/falah2025ijcai-continuous/}
}