Visual Reinforcement Learning with Residual Action

Abstract

Learning continuous control policies from visual observations is a fundamental and challenging task in reinforcement learning (RL). An essential problem is how to accurately map high-dimensional images to optimal actions through the policy network. Traditional decision-making modules output actions solely from the current observation, yet the distribution of optimal actions depends on the specific task and cannot be known a priori, which increases the learning difficulty. To ease learning, we analyze the action characteristics of several control tasks and propose Reinforcement Learning with Residual Action (ResAct), which explicitly models adjustments to actions based on the differences between adjacent observations rather than learning actions directly from observations. The method merely redefines the output of the policy network and introduces no prior assumptions that constrain or simplify the original control problem. Extensive experiments on the DeepMind Control Suite and CARLA demonstrate that the method significantly improves different RL baselines and achieves state-of-the-art performance.
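To illustrate the residual-action idea described in the abstract, the sketch below shows one plausible way a policy head could predict an action adjustment from the change between adjacent observations and add it to the previous action. This is a minimal, hypothetical PyTorch sketch; all class and parameter names are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch of a residual-action policy head (illustrative only).
# The head conditions on the difference between adjacent observation features
# and predicts a residual that is added to the previous action, instead of
# predicting the action directly from the current observation.
import torch
import torch.nn as nn

class ResidualActionPolicy(nn.Module):
    def __init__(self, feature_dim: int, action_dim: int, hidden_dim: int = 256):
        super().__init__()
        # Maps the change between adjacent observation features to a bounded residual action.
        self.residual_head = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, action_dim),
            nn.Tanh(),  # keep the predicted adjustment bounded
        )

    def forward(self, feat_t: torch.Tensor, feat_prev: torch.Tensor,
                action_prev: torch.Tensor) -> torch.Tensor:
        # Residual is conditioned on the observation difference, not the raw observation.
        delta = self.residual_head(feat_t - feat_prev)
        # New action = previous action + predicted adjustment, clipped to the valid range.
        return torch.clamp(action_prev + delta, -1.0, 1.0)

In this sketch the policy's output is merely redefined as an adjustment to the previous action; the underlying control problem and action space are unchanged, consistent with the abstract's description.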

Cite

Text

Liu et al. "Visual Reinforcement Learning with Residual Action." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I18.34097

Markdown

[Liu et al. "Visual Reinforcement Learning with Residual Action." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/liu2025aaai-visual/) doi:10.1609/AAAI.V39I18.34097

BibTeX

@inproceedings{liu2025aaai-visual,
  title     = {{Visual Reinforcement Learning with Residual Action}},
  author    = {Liu, Zhenxian and Peng, Peixi and Tian, Yonghong},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {19050--19058},
  doi       = {10.1609/AAAI.V39I18.34097},
  url       = {https://mlanthology.org/aaai/2025/liu2025aaai-visual/}
}