Two-Level Actor-Critic Using Multiple Teachers

Abstract

Deep reinforcement learning has successfully enabled agents to learn complex behaviors for many tasks. However, a key limitation of current approaches is sample inefficiency, which limits the performance of the learning agent. This paper considers how agents can learn more efficiently by using teachers' advice. In particular, we consider the setting with multiple sub-optimal teachers, as opposed to a single near-optimal teacher. We propose a flexible two-level actor-critic algorithm in which the high-level network learns to choose the best teacher for the current situation while the low-level network learns the control policy.
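To make the two-level idea concrete, the following is a minimal tabular sketch and not the paper's algorithm. It assumes a hypothetical one-step task with two states and two actions (reward 1 when the action index equals the state index), two hypothetical teachers that are each correct in only one state, softmax preference tables in place of networks, and a single shared state-value critic. It also assumes the advised action is always executed, which the abstract does not specify. All names (`TEACHERS`, `train`, `pg_update`) are illustrative.

```python
import math
import random


def softmax(prefs):
    """Convert a list of preferences into action probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an index according to the given probabilities."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1


def pg_update(prefs, chosen, delta, lr):
    """Policy-gradient step for a softmax policy: lr * delta * grad log pi(chosen)."""
    probs = softmax(prefs)
    for i in range(len(prefs)):
        grad = (1.0 if i == chosen else 0.0) - probs[i]
        prefs[i] += lr * delta * grad


# Hypothetical sub-optimal teachers: each always advises one fixed action,
# so each is correct in exactly one of the two states.
TEACHERS = [lambda s: 0, lambda s: 1]


def train(episodes=5000, lr=0.1, seed=0):
    rng = random.Random(seed)
    n_states, n_actions = 2, 2
    high = [[0.0] * len(TEACHERS) for _ in range(n_states)]  # teacher-selection actor
    low = [[0.0] * n_actions for _ in range(n_states)]       # control-policy actor
    value = [0.0] * n_states                                 # shared critic V(s)
    for _ in range(episodes):
        s = rng.randrange(n_states)
        k = sample(softmax(high[s]), rng)  # high level picks a teacher
        a = TEACHERS[k](s)                 # assumption: the advice is executed
        r = 1.0 if a == s else 0.0         # toy reward
        delta = r - value[s]               # one-step TD error (episode ends)
        value[s] += lr * delta
        pg_update(high[s], k, delta, lr)   # reinforce the teacher choice
        pg_update(low[s], a, delta, lr)    # low level learns from the executed action
    return high, low
```

Calling `high, low = train()` returns the two preference tables; `softmax(high[s])` gives the learned teacher-selection probabilities in state `s`, and `softmax(low[s])` gives the control policy that could act without any teacher. The low-level update here ignores the mismatch between the teacher's behavior and the low-level policy, a simplification that a fuller treatment would need to address.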

Cite

Text

Zhang et al. "Two-Level Actor-Critic Using Multiple Teachers." Transactions on Machine Learning Research, 2023.

Markdown

[Zhang et al. "Two-Level Actor-Critic Using Multiple Teachers." Transactions on Machine Learning Research, 2023.](https://mlanthology.org/tmlr/2023/zhang2023tmlr-twolevel/)

BibTeX

@article{zhang2023tmlr-twolevel,
  title     = {{Two-Level Actor-Critic Using Multiple Teachers}},
  author    = {Zhang, Su and Das, Srijita and Subramanian, Sriram Ganapathi and Taylor, Matthew E.},
  journal   = {Transactions on Machine Learning Research},
  year      = {2023},
  url       = {https://mlanthology.org/tmlr/2023/zhang2023tmlr-twolevel/}
}