ADDQ: Adaptive Distributional Double Q-Learning

Abstract

Bias in the estimation of Q-values is a well-known obstacle that slows the convergence of Q-learning and actor-critic methods. Part of the success of modern RL algorithms stems from direct or indirect mechanisms that reduce overestimation. We introduce an easy-to-implement method, built on top of distributional reinforcement learning (DRL) algorithms, that addresses overestimation in a locally adaptive way. Our framework, ADDQ, is simple to integrate: existing DRL implementations can be extended with a few lines of code. We provide theoretical support, experimental results in tabular, Atari, and MuJoCo environments, comparisons with state-of-the-art methods, and a proof of convergence in the tabular case.
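The abstract only sketches the mechanism, so the following is a minimal, hypothetical illustration of the general idea rather than the paper's actual algorithm: two distributional (quantile) critics are kept, as in double Q-learning, and the bootstrap target mixes the standard and the double estimate with a weight computed locally from how much the two return distributions disagree. All names, the spread-based weight, and the simplified quantile update are assumptions made for illustration only.

```python
# Hypothetical sketch (not the published ADDQ code): tabular quantile critics
# with a per-state-action mixing weight between the standard and the double
# ("cross") bootstrap target, derived from local distributional disagreement.
import numpy as np

n_states, n_actions, n_quantiles = 5, 3, 11
alpha, gamma = 0.1, 0.99

# Two independent quantile tables, as in double Q-learning.
Z_A = np.zeros((n_states, n_actions, n_quantiles))
Z_B = np.zeros((n_states, n_actions, n_quantiles))

def q_values(Z, s):
    # Mean over quantiles approximates Q(s, .).
    return Z[s].mean(axis=1)

def adaptive_weight(s, a):
    # Illustrative choice only: trust the double estimate more when the two
    # local return-distribution estimates disagree (large spread -> weight ~ 1).
    spread = np.abs(Z_A[s, a] - Z_B[s, a]).mean()
    scale = np.abs(Z_A[s, a]).mean() + np.abs(Z_B[s, a]).mean() + 1e-8
    return np.clip(spread / scale, 0.0, 1.0)

def update(s, a, r, s_next, done):
    # Greedy action is selected with Z_A; in practice the roles of Z_A and Z_B
    # would be swapped at random each step.
    a_star = int(np.argmax(q_values(Z_A, s_next)))
    own = Z_A[s_next, a_star]      # standard (same-table) target
    cross = Z_B[s_next, a_star]    # double-Q (cross-table) target
    beta = adaptive_weight(s_next, a_star)
    bootstrap = (1.0 - beta) * own + beta * cross   # locally adaptive mix
    target = 0.0 if done else gamma * np.sort(bootstrap)
    # Simplified move toward the target (proper quantile regression omitted).
    Z_A[s, a] += alpha * (r + target - Z_A[s, a])
```

The adaptive weight here is a stand-in: any local statistic of the estimated return distributions could play the same role, and the point of the sketch is only that the interpolation is computed per state-action pair rather than fixed globally.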

Cite

Text

Döring et al. "ADDQ: Adaptive Distributional Double Q-Learning." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Döring et al. "ADDQ: Adaptive Distributional Double Q-Learning." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/doring2025icml-addq/)

BibTeX

@inproceedings{doring2025icml-addq,
  title     = {{ADDQ: Adaptive Distributional Double Q-Learning}},
  author    = {Döring, Leif and Wille, Benedikt and Birr, Maximilian and Bîrsan, Mihail and Slowik, Martin},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {14344--14390},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/doring2025icml-addq/}
}