Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control
Abstract
In this work, we present a parallel-search approach to hyperparameter optimization for an online, off-policy reinforcement learning algorithm. Since this model-free learning algorithm solves the H∞ optimal tracking problem iteratively using ordinary least squares regression, we propose using the condition number of the data matrix as a model-free measure for tuning the hyperparameters. This measure enables automated optimization of the hyperparameters. We demonstrate that the condition number is a useful metric for tuning the number of collected samples, the sampling interval, and the other hyperparameters involved. In addition, we demonstrate a correlation between this condition number and properties of the sum-of-sinusoids persistent excitation.
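The idea of scoring hyperparameters by the condition number of a least-squares data matrix can be illustrated with a minimal sketch. This is not the paper's implementation: the regressor construction in `data_matrix`, the stand-in state trajectory, the excitation frequencies, and the candidate grid are all hypothetical choices made for illustration. Only `numpy.linalg.cond` and the general notion of a sum-of-sinusoids probing signal come from standard practice.

```python
import itertools
import numpy as np

def probing_signal(t, freqs):
    """Sum-of-sinusoids excitation evaluated at times t (illustrative)."""
    return np.sum(np.sin(np.outer(t, freqs)), axis=1)

def data_matrix(num_samples, dt, freqs):
    """Hypothetical regressor matrix: one row of quadratic features per sample.

    The state trajectory below is a stand-in, not a simulated plant.
    """
    t = np.arange(num_samples) * dt
    e = probing_signal(t, freqs)
    x = np.cos(0.5 * t)
    return np.column_stack([x**2, x * e, e**2])

def condition_number(num_samples, dt, freqs):
    """2-norm condition number of the data matrix (lower is better)."""
    return np.linalg.cond(data_matrix(num_samples, dt, freqs))

# Assumed excitation frequencies and candidate hyperparameters.
freqs = np.array([1.0, 3.0, 7.0])
grid = list(itertools.product([20, 50, 100], [0.01, 0.05, 0.1]))

# Each candidate is scored independently, so this loop could be
# distributed across workers in a parallel search.
scores = {cfg: condition_number(cfg[0], cfg[1], freqs) for cfg in grid}
best = min(scores, key=scores.get)
print("best (num_samples, dt):", best, "cond:", scores[best])
```

Because the score depends only on collected data and not on a plant model, a search of this kind stays model-free. In a real setting the rows would come from measured trajectories rather than the synthetic signals assumed here.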
Cite
Text
Farahmandi et al. "Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control." Proceedings of The 5th Annual Learning for Dynamics and Control Conference, 2023.
Markdown
[Farahmandi et al. "Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control." Proceedings of The 5th Annual Learning for Dynamics and Control Conference, 2023.](https://mlanthology.org/l4dc/2023/farahmandi2023l4dc-hyperparameter/)
BibTeX
@inproceedings{farahmandi2023l4dc-hyperparameter,
title = {{Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control}},
author = {Farahmandi, Alireza and Reitz, Brian C and Debord, Mark and Philbrick, Douglas and Estabridis, Katia and Hewer, Gary},
booktitle = {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
year = {2023},
pages = {1455--1466},
volume = {211},
url = {https://mlanthology.org/l4dc/2023/farahmandi2023l4dc-hyperparameter/}
}