Trust the Model When It Is Confident: Masked Model-Based Actor-Critic
Abstract
It is a popular belief that model-based Reinforcement Learning (RL) is more sample efficient than model-free RL, but in practice this is not always true, because model errors can be over-weighted. In complex and noisy settings, model-based RL tends to have trouble exploiting the model if it does not know when the model can be trusted.
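The idea suggested by the title, using model-generated data only when the model is confident, can be illustrated with a small sketch. Everything below is a hypothetical construction for illustration: it scores each model rollout by the disagreement of an ensemble of learned dynamics models (a common proxy for model confidence) and masks out the least reliable transitions. It is not the paper's exact masking rule.

```python
import numpy as np

rng = np.random.default_rng(0)


def ensemble_predict(models, state, action):
    """Stack next-state predictions from each member of a (hypothetical) ensemble."""
    return np.stack([m(state, action) for m in models])


def masked_rollout_filter(models, states, actions, keep_ratio=0.5):
    """Keep only the model-generated transitions with the lowest ensemble
    disagreement, used here as a stand-in for model confidence.
    Illustrative sketch only, not the method from the paper."""
    preds = np.stack(
        [ensemble_predict(models, s, a) for s, a in zip(states, actions)]
    )  # shape: (num_transitions, num_models, state_dim)
    # Disagreement: variance across ensemble members, averaged over state dims.
    disagreement = preds.var(axis=1).mean(axis=-1)
    k = max(1, int(keep_ratio * len(states)))
    # Indices of the k most "trusted" (lowest-disagreement) transitions.
    return np.argsort(disagreement)[:k]


# Toy ensemble: linear dynamics with member-specific perturbations.
models = [lambda s, a, w=w: s + a + w for w in rng.normal(0.0, 0.1, size=(4, 3))]
states = rng.normal(size=(8, 3))
actions = rng.normal(size=(8, 3))
kept = masked_rollout_filter(models, states, actions, keep_ratio=0.5)
print(kept)  # indices of the 4 most confident transitions
```

In a model-based actor-critic loop, only the `kept` transitions would be added to the synthetic replay buffer, so the policy is trained on model data only where the model appears reliable.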
Cite
Text
Pan et al. "Trust the Model When It Is Confident: Masked Model-Based Actor-Critic." Neural Information Processing Systems, 2020.

Markdown

[Pan et al. "Trust the Model When It Is Confident: Masked Model-Based Actor-Critic." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/pan2020neurips-trust/)

BibTeX
@inproceedings{pan2020neurips-trust,
title = {{Trust the Model When It Is Confident: Masked Model-Based Actor-Critic}},
author = {Pan, Feiyang and He, Jia and Tu, Dandan and He, Qing},
booktitle = {Neural Information Processing Systems},
year = {2020},
url = {https://mlanthology.org/neurips/2020/pan2020neurips-trust/}
}