TGRL: An Algorithm for Teacher Guided Reinforcement Learning
Abstract
We consider solving sequential decision-making problems in the scenario where the agent has access to two supervision sources: a $\textit{reward signal}$ and a $\textit{teacher}$ that can be queried to obtain a $\textit{good}$ action for any state encountered by the agent. Learning solely from rewards, or reinforcement learning, is data inefficient and may not learn high-reward policies in challenging scenarios involving sparse rewards or partial observability. On the other hand, learning from a teacher may sometimes be infeasible. For instance, the actions provided by a teacher with privileged information may be unlearnable by an agent with limited information (i.e., partial observability). In other scenarios, the teacher might be sub-optimal, and imitating their actions can limit the agent’s performance. To overcome these challenges, prior work proposed to jointly optimize imitation and reinforcement learning objectives but relied on heuristics and problem-specific hyperparameter tuning to balance the two objectives. We introduce Teacher Guided Reinforcement Learning (TGRL), a principled approach to dynamically balance following the teacher’s guidance and leveraging RL. TGRL outperforms strong baselines across diverse domains without hyperparameter tuning.
Cite
Text
Shenfeld et al. "TGRL: An Algorithm for Teacher Guided Reinforcement Learning." International Conference on Machine Learning, 2023.
Markdown
[Shenfeld et al. "TGRL: An Algorithm for Teacher Guided Reinforcement Learning." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/shenfeld2023icml-tgrl/)
BibTeX
@inproceedings{shenfeld2023icml-tgrl,
title = {{TGRL: An Algorithm for Teacher Guided Reinforcement Learning}},
author = {Shenfeld, Idan and Hong, Zhang-Wei and Tamar, Aviv and Agrawal, Pulkit},
booktitle = {International Conference on Machine Learning},
year = {2023},
pages = {31077-31093},
volume = {202},
url = {https://mlanthology.org/icml/2023/shenfeld2023icml-tgrl/}
}