Multi-Constraint Deep Reinforcement Learning for Smooth Action Control
Abstract
Deep reinforcement learning (DRL) has been studied in a variety of challenging decision-making tasks, e.g., autonomous driving. However, DRL typically suffers from the action shaking problem, meaning that agents can select significantly different actions even when states differ only slightly. One of the crucial reasons for this issue is the inappropriate design of the reward in DRL. In this paper, to address this issue, we propose a novel way to incorporate the smoothness of actions into the reward. Specifically, we introduce sub-rewards and add multiple constraints related to these sub-rewards. In addition, we propose a multi-constraint proximal policy optimization (MCPPO) method to solve the multi-constraint DRL problem. Extensive simulation results show that the proposed MCPPO method achieves better action smoothness than the traditional proportional-integral-derivative (PID) controller and mainstream DRL algorithms. The video is available at https://youtu.be/F2jpaSm7YOg.
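The abstract's core idea, penalizing abrupt action changes through a smoothness-related sub-reward, can be illustrated with a minimal sketch. This is not the paper's exact MCPPO formulation (the paper enforces constraints on sub-rewards rather than a fixed weighted sum); the function names and the penalty weight here are assumptions for illustration only.

```python
# Illustrative sketch of smoothness-aware reward shaping (assumed names,
# not the paper's exact method): penalize the squared change between
# consecutive actions as a sub-reward, then combine it with the task reward.

def smoothness_sub_reward(prev_action, action, weight=0.1):
    """Negative sub-reward proportional to the squared action change.

    A large jump between consecutive actions yields a large penalty,
    discouraging the 'action shaking' behavior described in the abstract.
    """
    return -weight * sum((a - p) ** 2 for a, p in zip(action, prev_action))


def shaped_reward(task_reward, prev_action, action, weight=0.1):
    """Combine the original task reward with the smoothness sub-reward."""
    return task_reward + smoothness_sub_reward(prev_action, action, weight)
```

For example, with a task reward of 1.0 and a one-dimensional action jumping from 0.0 to 1.0 at weight 0.1, the shaped reward is 1.0 - 0.1 = 0.9; an unchanged action leaves the task reward intact. MCPPO instead treats such sub-rewards as constraints within a PPO-style optimization, which avoids hand-tuning a single penalty weight.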
Citation
Zou et al. "Multi-Constraint Deep Reinforcement Learning for Smooth Action Control." International Joint Conference on Artificial Intelligence, 2022. doi:10.24963/IJCAI.2022/528
BibTeX
@inproceedings{zou2022ijcai-multi,
title = {{Multi-Constraint Deep Reinforcement Learning for Smooth Action Control}},
author = {Zou, Guangyuan and He, Ying and Yu, F. Richard and Chen, Longquan and Pan, Weike and Ming, Zhong},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2022},
pages = {3802-3808},
doi = {10.24963/IJCAI.2022/528},
url = {https://mlanthology.org/ijcai/2022/zou2022ijcai-multi/}
}