Integral Performance Approximation for Continuous-Time Reinforcement Learning Control

Abstract

We introduce integral performance approximation (IPA), a new continuous-time reinforcement learning (CT-RL) control method. IPA leverages an affine nonlinear dynamic model, which partially captures the dynamics of the physical environment, alongside state-action trajectory data to enable optimal control with high data efficiency and robust control performance. By building on Kleinman algorithm structures, IPA provides theoretical guarantees of learning convergence, solution optimality, and closed-loop stability. We demonstrate the effectiveness of IPA on three CT-RL environments, including hypersonic vehicle (HSV) control, which poses additional challenges due to its unstable and nonminimum-phase dynamics. Our results show that IPA achieves new state-of-the-art control design and performance in CT-RL.
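For context, the following is a minimal, model-based sketch of the classical Kleinman iteration whose structure the abstract refers to, assuming a known LTI pair (A, B) and LQR weights Q, R; the function name, tolerance, and iteration cap are illustrative. IPA's contribution is to carry out the policy-evaluation step from state-action trajectory data and a partial model rather than from a fully known model, so this sketch shows only the underlying algorithmic skeleton, not the IPA method itself.

import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def kleinman_iteration(A, B, Q, R, K0, tol=1e-9, max_iters=50):
    """Classical Kleinman policy iteration for continuous-time LQR.

    Starting from a stabilizing gain K0, alternate a policy-evaluation
    Lyapunov solve with a policy-improvement gain update. P converges to
    the stabilizing solution of the algebraic Riccati equation
    A'P + PA - PBR^{-1}B'P + Q = 0, and K to the optimal gain.
    """
    K = K0
    P_prev = None
    for _ in range(max_iters):
        Ac = A - B @ K            # closed-loop dynamics under the current policy
        Qc = Q + K.T @ R @ K      # running cost incurred by the current policy
        # Policy evaluation: solve Ac' P + P Ac + Qc = 0 for P
        P = solve_continuous_lyapunov(Ac.T, -Qc)
        # Policy improvement: K <- R^{-1} B' P
        K = np.linalg.solve(R, B.T @ P)
        if P_prev is not None and np.linalg.norm(P - P_prev, "fro") < tol:
            break
        P_prev = P
    return P, K

Under the classical assumptions, every iterate K remains stabilizing and the value matrices P decrease monotonically to the Riccati solution; this is the structural property the abstract credits for IPA's convergence, optimality, and closed-loop stability guarantees.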

Cite

Text

Wallace and Si. "Integral Performance Approximation for Continuous-Time Reinforcement Learning Control." International Conference on Learning Representations, 2025.

Markdown

[Wallace and Si. "Integral Performance Approximation for Continuous-Time Reinforcement Learning Control." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/wallace2025iclr-integral/)

BibTeX

@inproceedings{wallace2025iclr-integral,
  title     = {{Integral Performance Approximation for Continuous-Time Reinforcement Learning Control}},
  author    = {Wallace, Brent A. and Si, Jennie},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/wallace2025iclr-integral/}
}