QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning
Abstract
Deep reinforcement learning continues to show tremendous potential in achieving task-level autonomy; however, its computational and energy demands remain prohibitively high. In this paper, we tackle this problem by applying quantization to reinforcement learning. To that end, we introduce a novel Reinforcement Learning (RL) training paradigm, ActorQ, to speed up actor-learner distributed RL training. ActorQ leverages 8-bit quantized actors to speed up data collection without affecting learning convergence. Our quantized distributed RL training system, ActorQ, demonstrates end-to-end speedups of 1.5x-2.5x and faster convergence than full-precision training on a range of tasks (DeepMind Control Suite) and RL algorithms (D4PG, DQN). Furthermore, we compare the carbon emissions (kg of CO2) of ActorQ against standard reinforcement learning on various tasks. Across various settings, we show that ActorQ enables more environmentally friendly reinforcement learning, achieving 2.8x lower carbon emissions and energy consumption compared to training RL agents in full precision. Finally, we demonstrate empirically that aggressively quantized RL policies (down to 4-5 bits) enable significant speedups on quantization-friendly (i.e., natively supporting low-precision arithmetic) resource-constrained edge devices without degrading accuracy. We believe that this is the first of many future works on enabling computationally energy-efficient and sustainable reinforcement learning. The source code for QuaRL is publicly available at https://bit.ly/quarl-tmlr.
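To make the core idea concrete, here is a minimal sketch (not the authors' implementation) of how an actor-learner setup might use quantized actors: the learner keeps full-precision weights, while each actor receives an int8-quantized copy of the policy for rollouts. The uniform affine quantizer and the toy linear policy below are illustrative assumptions only.

```python
# Sketch, assuming a simple uniform int8 weight quantizer; not the ActorQ code.
import numpy as np

def quantize_int8(w):
    """Uniformly quantize a float32 tensor to int8 values plus a scale factor."""
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float32 tensor from int8 values and the scale."""
    return q.astype(np.float32) * scale

# Learner-side full-precision weights for a toy linear policy (4 obs dims, 2 actions).
rng = np.random.default_rng(0)
w_fp32 = rng.normal(size=(4, 2)).astype(np.float32)

# Broadcast a quantized copy to an actor; the actor uses it for action selection.
w_q, s = quantize_int8(w_fp32)
obs = rng.normal(size=(1, 4)).astype(np.float32)
action_logits = obs @ dequantize(w_q, s)  # actor-side inference with 8-bit weights
action = int(np.argmax(action_logits))

print("selected action:", action)
print("max weight quantization error:", float(np.max(np.abs(w_fp32 - dequantize(w_q, s)))))
```

In this sketch only the actor-side copy is quantized, so data collection runs at reduced precision while gradient updates on the learner remain in full precision, which is the separation the abstract describes.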
Cite
Text
Krishnan et al. "QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning." Transactions on Machine Learning Research, 2022.Markdown
[Krishnan et al. "QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning." Transactions on Machine Learning Research, 2022.](https://mlanthology.org/tmlr/2022/krishnan2022tmlr-quarl/)BibTeX
@article{krishnan2022tmlr-quarl,
title = {{QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning}},
author = {Krishnan, Srivatsan and Lam, Max and Chitlangia, Sharad and Wan, Zishen and Barth-Maron, Gabriel and Faust, Aleksandra and Reddi, Vijay Janapa},
journal = {Transactions on Machine Learning Research},
year = {2022},
url = {https://mlanthology.org/tmlr/2022/krishnan2022tmlr-quarl/}
}