Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control

Abstract

In this work, we present the hyperparameter optimization of an online, off-policy reinforcement learning algorithm based on a parallel search. Since this model-free learning algorithm solves the H∞ optimal tracking problem iteratively using ordinary least squares regression, we propose using the condition number of the data matrix as a model-free measure for tuning the hyperparameters, which enables automated optimization of the hyperparameters involved. We demonstrate that the condition number is a useful metric for tuning the number of collected samples, the sampling interval, and other hyperparameters. In addition, we demonstrate a correlation between this condition number and properties of the sum-of-sinusoids persistent excitation.
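
The idea of using the data-matrix condition number as a tuning signal can be illustrated with a small sketch. The snippet below is not the paper's algorithm: the function names, frequencies, amplitudes, and the delayed-sample regressor construction are illustrative assumptions. It only shows how a sum-of-sinusoids excitation with richer frequency content yields a better-conditioned least-squares data matrix, which is the kind of model-free signal the abstract describes.

```python
import numpy as np

def sum_of_sinusoids(t, freqs, amps):
    # Illustrative sum-of-sinusoids probing signal (persistent excitation).
    # Frequencies/amplitudes here are arbitrary, not the paper's values.
    return sum(a * np.sin(2 * np.pi * f * t) for f, a in zip(freqs, amps))

def data_matrix_condition(num_samples, dt, freqs, amps, n_features=4):
    # Build a toy regression data matrix from sampled excitation and return
    # its condition number -- the model-free tuning metric discussed above.
    t = np.arange(num_samples) * dt
    u = sum_of_sinusoids(t, freqs, amps)
    # Columns of delayed samples stand in for the regressors of an
    # ordinary-least-squares fit; each column has the same length.
    X = np.column_stack([u[i:num_samples - n_features + i + 1]
                         for i in range(n_features)])
    return np.linalg.cond(X)  # ratio of largest to smallest singular value

if __name__ == "__main__":
    # One sinusoid spans only a 2-D space, so 4 delayed columns are
    # (numerically) rank deficient and the condition number explodes.
    cond_poor = data_matrix_condition(200, 0.1, freqs=[0.2], amps=[1.0])
    # Three distinct frequencies give a full-rank, well-conditioned matrix.
    cond_rich = data_matrix_condition(200, 0.1,
                                      freqs=[0.1, 0.3, 0.7],
                                      amps=[1.0, 1.0, 1.0])
    print(f"poor excitation: cond = {cond_poor:.3e}")
    print(f"rich excitation: cond = {cond_rich:.3e}")
```

In this toy setting, a large condition number flags that the chosen excitation, sample count, or sampling interval produces nearly collinear regressors, so the least-squares step is unreliable; sweeping hyperparameters to minimize this number mirrors the automated tuning the abstract proposes.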

Cite

Text

Farahmandi et al. "Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control." Proceedings of The 5th Annual Learning for Dynamics and Control Conference, 2023.

Markdown

[Farahmandi et al. "Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control." Proceedings of The 5th Annual Learning for Dynamics and Control Conference, 2023.](https://mlanthology.org/l4dc/2023/farahmandi2023l4dc-hyperparameter/)

BibTeX

@inproceedings{farahmandi2023l4dc-hyperparameter,
  title     = {{Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control}},
  author    = {Farahmandi, Alireza and Reitz, Brian C. and Debord, Mark and Philbrick, Douglas and Estabridis, Katia and Hewer, Gary},
  booktitle = {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
  year      = {2023},
  pages     = {1455--1466},
  volume    = {211},
  url       = {https://mlanthology.org/l4dc/2023/farahmandi2023l4dc-hyperparameter/}
}