Deep Residual Reinforcement Learning (Extended Abstract)
Abstract
We revisit residual algorithms in both model-free and model-based reinforcement learning settings. We propose the bidirectional target network technique to stabilize residual algorithms, yielding a residual version of DDPG that significantly outperforms vanilla DDPG on commonly used benchmarks. Moreover, we find that the residual algorithm is an effective approach to the distribution mismatch problem in model-based planning. Compared with the existing TD(k) method, our residual-based method makes weaker assumptions about the model and yields a greater performance boost.
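For context: whereas semi-gradient TD treats the bootstrap target as a constant, a residual algorithm also differentiates through the next-state value, i.e. it descends the Bellman error with respect to both sides. The sketch below shows one plausible way to combine this with a target network in each direction, roughly in the spirit of the bidirectional target network idea described above. It is a minimal illustration, not the authors' exact formulation: the function names, the DDPG-style interfaces, and the mixing coefficient `eta` are assumptions.

```python
import torch
import torch.nn.functional as F

def residual_critic_loss(critic, critic_target, actor_target,
                         s, a, r, s_next, gamma=0.99, eta=0.05):
    """Illustrative residual (Bellman-error) critic loss with target
    networks on both sides. NOTE: eta and the exact mixing are
    assumptions for this sketch, not the paper's precise update."""
    with torch.no_grad():
        a_next = actor_target(s_next)

    # Forward (semi-gradient) direction: differentiate only through
    # Q(s, a); the bootstrap target uses the target critic.
    with torch.no_grad():
        td_target = r + gamma * critic_target(s_next, a_next)
    forward_loss = F.mse_loss(critic(s, a), td_target)

    # Backward (residual) direction: differentiate only through
    # Q(s', a'); the current-state value comes from the target critic.
    with torch.no_grad():
        q_sa_frozen = critic_target(s, a)
    backward_loss = F.mse_loss(r + gamma * critic(s_next, a_next), q_sa_frozen)

    return forward_loss + eta * backward_loss
```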
Cite
Text
Zhang et al. "Deep Residual Reinforcement Learning (Extended Abstract)." International Joint Conference on Artificial Intelligence, 2021. doi:10.24963/IJCAI.2021/668Markdown
[Zhang et al. "Deep Residual Reinforcement Learning (Extended Abstract)." International Joint Conference on Artificial Intelligence, 2021.](https://mlanthology.org/ijcai/2021/zhang2021ijcai-deep/) doi:10.24963/IJCAI.2021/668BibTeX
@inproceedings{zhang2021ijcai-deep,
title = {{Deep Residual Reinforcement Learning (Extended Abstract)}},
author = {Zhang, Shangtong and Boehmer, Wendelin and Whiteson, Shimon},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2021},
pages = {4869--4873},
doi = {10.24963/IJCAI.2021/668},
url = {https://mlanthology.org/ijcai/2021/zhang2021ijcai-deep/}
}