Robust Knowledge Transfer in Tiered Reinforcement Learning
Abstract
In this paper, we study the Tiered Reinforcement Learning setting, a parallel transfer learning framework whose goal is to transfer knowledge from the low-tier (source) task to the high-tier (target) task in order to reduce the exploration risk of the latter while solving the two tasks in parallel. Unlike previous work, we do not assume that the low-tier and high-tier tasks share the same dynamics or reward functions, and we focus on robust knowledge transfer without prior knowledge of the task similarity. We identify a natural and necessary condition for this objective, which we call "Optimal Value Dominance". Under this condition, we propose novel online learning algorithms such that the high-tier task achieves constant regret on a subset of states, depending on the task similarity, and retains near-optimal regret when the two tasks are dissimilar, while the low-tier task remains near-optimal without any sacrifice. Moreover, we further study the setting with multiple low-tier tasks and propose a novel transfer-source selection mechanism that ensembles information from all low-tier tasks and allows provable benefits on a much larger state-action space.
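The parallel interaction protocol described above can be sketched concretely: both agents run episodes on their own tasks in parallel, and the high-tier learner may reuse the low-tier learner's value estimates. The script below is a minimal illustrative sketch, not the paper's algorithm; the toy MDPs, the optimistic Q-learning update, the visit-count trust rule, and the similarity slack of 0.5 are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
S, A, H = 4, 2, 5  # states, actions, horizon (hypothetical sizes)

def random_mdp(seed):
    """Build a random tabular MDP: transition kernel P[s, a] and mean rewards R[s, a]."""
    r = np.random.default_rng(seed)
    P = r.dirichlet(np.ones(S), size=(S, A))  # each P[s, a] is a distribution over next states
    R = r.uniform(size=(S, A))                # mean rewards in [0, 1]
    return P, R

P_lo, R_lo = random_mdp(1)  # low-tier (source) task
P_hi, R_hi = random_mdp(2)  # high-tier (target) task; dynamics/rewards may differ

Q_lo = np.full((H, S, A), float(H))  # optimistic initialization (max return is H)
Q_hi = np.full((H, S, A), float(H))
counts_lo = np.zeros((H, S, A))
counts_hi = np.zeros((H, S, A))

def episode(P, R, Q, counts):
    """One episode of crude optimistic Q-learning on the given task."""
    s = 0
    for h in range(H):
        a = int(np.argmax(Q[h, s]))
        counts[h, s, a] += 1
        s2 = rng.choice(S, p=P[s, a])
        v_next = Q[h + 1, s2].max() if h + 1 < H else 0.0
        lr = 1.0 / counts[h, s, a]
        bonus = np.sqrt(H / counts[h, s, a])  # shrinking exploration bonus
        Q[h, s, a] += lr * (R[s, a] + v_next + bonus - Q[h, s, a])
        s = s2

for k in range(2000):
    # Low-tier learns as usual: transfer never interferes with it.
    episode(P_lo, R_lo, Q_lo, counts_lo)
    # Transfer step (illustrative): where the low-tier estimate is well visited,
    # let it cap the high-tier optimism; elsewhere the high-tier agent explores
    # on its own. This mimics "benefit when similar, no harm when dissimilar".
    mask = counts_lo > 50                                  # hypothetical trust rule
    Q_hi[mask] = np.minimum(Q_hi[mask], Q_lo[mask] + 0.5)  # 0.5 = assumed similarity slack
    episode(P_hi, R_hi, Q_hi, counts_hi)

print("low-tier V(s0):", Q_lo[0, 0].max(), " high-tier V(s0):", Q_hi[0, 0].max())
```

The one-directional transfer step reflects the asymmetry of the setting: the high-tier learner can only benefit from (or ignore) low-tier information, while the low-tier learner is never altered, so its near-optimality is preserved by construction.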
Cite
Text
Huang and He. "Robust Knowledge Transfer in Tiered Reinforcement Learning." Neural Information Processing Systems, 2023.
Markdown
[Huang and He. "Robust Knowledge Transfer in Tiered Reinforcement Learning." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/huang2023neurips-robust-a/)
BibTeX
@inproceedings{huang2023neurips-robust-a,
title = {{Robust Knowledge Transfer in Tiered Reinforcement Learning}},
author = {Huang, Jiawei and He, Niao},
booktitle = {Neural Information Processing Systems},
year = {2023},
url = {https://mlanthology.org/neurips/2023/huang2023neurips-robust-a/}
}