Learning Robust Representation for Reinforcement Learning with Distractions by Reward Sequence Prediction
Abstract
Reinforcement learning algorithms have achieved remarkable success in acquiring behavioral skills directly from pixel inputs. However, applying them in real-world scenarios is challenging because they are sensitive to visual distractions (e.g., changes in viewpoint and lighting). A key factor behind this sensitivity is that the learned representations often overfit to task-irrelevant information. By comparing several representation learning methods, we find that the key to alleviating overfitting in representation learning is to choose proper prediction targets. Motivated by this comparison, we propose a novel representation learning approach, namely reward sequence prediction (RSP), that uses reward sequences or their transforms (e.g., the discrete-time Fourier transform) as prediction targets. RSP can efficiently learn robust representations because reward sequences rarely contain task-irrelevant information while providing a large number of supervision signals to accelerate representation learning. An appealing feature is that RSP makes no assumption about the type of distractions and thus can improve performance even when multiple types of distractions exist. We evaluate our approach on the Distracting Control Suite. Experiments show that our method achieves state-of-the-art sample efficiency and generalization ability in tasks with distractions.
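To make the idea of reward sequences and their transforms as prediction targets more concrete, here is a minimal sketch of how such targets might be built from one trajectory's rewards. It is not the authors' implementation: the function name `reward_sequence_targets`, the fixed window length `horizon`, and the use of a finite discrete Fourier transform (via NumPy's FFT) over each window are all assumptions made for illustration.

```python
import numpy as np


def reward_sequence_targets(rewards, horizon):
    """Build illustrative prediction targets from a 1-D reward trajectory.

    Hypothetical helper, not taken from the paper.
    Returns two arrays with one row per time step t that has a full window:
      sequences[t] = (r_t, ..., r_{t+horizon-1})
      fourier[t]   = real parts followed by imaginary parts of the DFT of that window
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    if rewards.ndim != 1:
        raise ValueError("rewards must be one-dimensional")
    if horizon < 1 or horizon > rewards.shape[0]:
        raise ValueError("horizon must be between 1 and len(rewards)")

    # Each row is one length-`horizon` window of consecutive rewards.
    windows = np.lib.stride_tricks.sliding_window_view(rewards, horizon)

    # Finite DFT along the time axis of each window (an assumed stand-in
    # for the transform mentioned in the abstract).
    spectrum = np.fft.fft(windows, axis=-1)

    # Split complex values into real-valued regression targets.
    fourier = np.concatenate([spectrum.real, spectrum.imag], axis=-1)
    return windows.copy(), fourier


# Toy trajectory with made-up rewards, purely for illustration.
rewards = [0.0, 1.0, 0.0, 2.0, 1.0]
sequences, fourier = reward_sequence_targets(rewards, horizon=3)
print(sequences.shape, fourier.shape)
```

In a setup like this, an encoder would be trained to regress either `sequences` or `fourier` from the observation (and the subsequent actions) at each step. Because the transform is invertible, both target types carry the same information about the rewards.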
Cite
Text
Zhou et al. "Learning Robust Representation for Reinforcement Learning with Distractions by Reward Sequence Prediction." Uncertainty in Artificial Intelligence, 2023.
Markdown
[Zhou et al. "Learning Robust Representation for Reinforcement Learning with Distractions by Reward Sequence Prediction." Uncertainty in Artificial Intelligence, 2023.](https://mlanthology.org/uai/2023/zhou2023uai-learning/)
BibTeX
@inproceedings{zhou2023uai-learning,
title = {{Learning Robust Representation for Reinforcement Learning with Distractions by Reward Sequence Prediction}},
author = {Zhou, Qi and Wang, Jie and Liu, Qiyuan and Kuang, Yufei and Zhou, Wengang and Li, Houqiang},
booktitle = {Uncertainty in Artificial Intelligence},
year = {2023},
pages = {2551-2562},
volume = {216},
url = {https://mlanthology.org/uai/2023/zhou2023uai-learning/}
}