Reinforcement Teaching
Abstract
Machine learning algorithms learn to solve a task, but are unable to improve their ability to learn. Meta-learning methods learn about machine learning algorithms and improve them so that they learn more quickly. However, existing meta-learning methods are either hand-crafted to improve one specific component of an algorithm or only work with differentiable algorithms. We develop a unifying meta-learning framework, called Reinforcement Teaching, to improve the learning process of any algorithm. Under Reinforcement Teaching, a teaching policy is learned, through reinforcement, to improve a student's learning algorithm. To learn an effective teaching policy, we introduce the parametric-behavior embedder that learns a representation of the student's learnable parameters from its input/output behavior. We further use learning progress to shape the teacher's reward, allowing it to more quickly maximize the student's performance. To demonstrate the generality of Reinforcement Teaching, we conduct experiments in which a teacher learns to significantly improve both reinforcement and supervised learning algorithms. Reinforcement Teaching outperforms previous work using heuristic reward functions and state representations, as well as other parameter representations.
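To give a rough sense of how a learning-progress reward signal could look, the following is a minimal sketch, not the authors' implementation. It assumes learning progress is measured as the change in the student's performance between consecutive teacher steps; the function name `learning_progress_rewards` and the accuracy values are hypothetical and chosen only for illustration.

```python
from typing import List


def learning_progress_rewards(performance: List[float]) -> List[float]:
    """Return one reward per teacher step: the change in student performance.

    ASSUMPTION: learning progress is taken to be the difference between
    consecutive performance measurements; the paper may define it differently.
    """
    return [after - before for before, after in zip(performance, performance[1:])]


# Hypothetical student accuracy measured after each teacher action.
accuracy = [0.10, 0.25, 0.40, 0.38, 0.55]
rewards = learning_progress_rewards(accuracy)

# The per-step rewards sum to the student's total improvement.
total_improvement = accuracy[-1] - accuracy[0]
```

Under this assumption the teacher is rewarded at every step in which the student improves, rather than only when the student reaches its final performance, which is one plausible reason such a signal could be easier to learn from.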
Cite
Text
Muslimani et al. "Reinforcement Teaching." Transactions on Machine Learning Research, 2023.
Markdown
[Muslimani et al. "Reinforcement Teaching." Transactions on Machine Learning Research, 2023.](https://mlanthology.org/tmlr/2023/muslimani2023tmlr-reinforcement/)
BibTeX
@article{muslimani2023tmlr-reinforcement,
title = {{Reinforcement Teaching}},
author = {Muslimani, Calarina and Lewandowski, Alex and Schuurmans, Dale and Taylor, Matthew E. and Luo, Jun},
journal = {Transactions on Machine Learning Research},
year = {2023},
url = {https://mlanthology.org/tmlr/2023/muslimani2023tmlr-reinforcement/}
}