QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning

Abstract

Deep reinforcement learning continues to show tremendous potential in achieving task-level autonomy; however, its computational and energy demands remain prohibitively high. In this paper, we tackle this problem by applying quantization to reinforcement learning. To that end, we introduce a novel Reinforcement Learning (RL) training paradigm, \textit{ActorQ}, to speed up actor-learner distributed RL training. \textit{ActorQ} leverages 8-bit quantized actors to speed up data collection without affecting learning convergence. Our quantized distributed RL training system demonstrates end-to-end speedups of $>$1.5$\times$--2.5$\times$ and faster convergence than full-precision training on a range of tasks (DeepMind Control Suite) and different RL algorithms (D4PG, DQN). Furthermore, we compare the carbon emissions (kg of CO2) of \textit{ActorQ} against those of standard reinforcement learning on various tasks. Across various settings, we show that \textit{ActorQ} enables more environmentally friendly reinforcement learning, producing 2.8$\times$ lower carbon emissions and energy consumption than training RL agents in full precision. Finally, we demonstrate empirically that aggressively quantized RL policies (down to 4/5 bits) enable significant speedups on quantization-friendly (i.e., with native quantization support) resource-constrained edge devices without degrading accuracy. We believe that this is the first of many future works on enabling computationally efficient, energy-efficient, and sustainable reinforcement learning. The source code for QuaRL is publicly available at \url{https://bit.ly/quarl-tmlr}.
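The quantized-actor idea at the heart of the abstract can be illustrated with a minimal post-training int8 quantization sketch. This is an illustrative assumption about the mechanism, not the paper's actual implementation: the learner keeps full-precision weights, and actors would run inference on quantized copies. All function names here are hypothetical.

```python
import numpy as np

def quantize_int8(w):
    """Affine (asymmetric) quantization of a float32 weight array to uint8.

    Returns the quantized weights plus the scale and zero-point needed
    to map them back to (approximate) full precision.
    """
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    zero_point = np.round(-lo / scale)
    q = np.clip(np.round(w / scale + zero_point), 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float32 weights from the quantized copy."""
    return scale * (q.astype(np.float32) - zero_point)

# Learner-side full-precision weights; an actor would receive (q, s, z).
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
q, s, z = quantize_int8(w)

# Reconstruction error is bounded by one quantization step,
# which is why 8-bit actors can collect data without hurting convergence.
err = float(np.abs(dequantize(q, s, z) - w).max())
```

The actor stores and computes on 4x smaller uint8 tensors, which is where the data-collection speedup in \textit{ActorQ} comes from.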

Cite

Text

Krishnan et al. "QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning." Transactions on Machine Learning Research, 2022.

Markdown

[Krishnan et al. "QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning." Transactions on Machine Learning Research, 2022.](https://mlanthology.org/tmlr/2022/krishnan2022tmlr-quarl/)

BibTeX

@article{krishnan2022tmlr-quarl,
  title     = {{QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning}},
  author    = {Krishnan, Srivatsan and Lam, Max and Chitlangia, Sharad and Wan, Zishen and Barth-Maron, Gabriel and Faust, Aleksandra and Reddi, Vijay Janapa},
  journal   = {Transactions on Machine Learning Research},
  year      = {2022},
  url       = {https://mlanthology.org/tmlr/2022/krishnan2022tmlr-quarl/}
}