ADDQ: Adaptive Distributional Double Q-Learning
Abstract
Bias in the estimation of Q-values is a well-known obstacle that slows down convergence of Q-learning and actor-critic methods. Part of the success of modern RL algorithms stems from direct or indirect overestimation-reduction mechanisms. We introduce an easy-to-implement method, built on top of distributional reinforcement learning (DRL) algorithms, that deals with overestimation in a locally adaptive way. Our framework ADDQ is simple to implement: existing DRL implementations can be improved with a few lines of code. We provide theoretical backing and experimental results in tabular, Atari, and MuJoCo environments, comparisons with state-of-the-art methods, and a proof of convergence in the tabular case.
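The abstract describes a locally adaptive correction built on top of distributional double Q-learning. As a rough illustration of how such a mechanism could be wired into a tabular distributional learner, the sketch below keeps two categorical return-distribution tables and mixes the standard and double targets with a per-state-action weight derived from the local spread of the return distribution. This is a minimal, hypothetical sketch: the table sizes, the categorical support, and the spread-based weighting rule are assumptions for illustration, not the authors' exact ADDQ formulation.

```python
import numpy as np

# Illustrative tabular setup (all sizes and hyperparameters are assumptions).
n_states, n_actions = 10, 4
v_min, v_max, n_atoms = -10.0, 10.0, 51
atoms = np.linspace(v_min, v_max, n_atoms)
gamma, lr = 0.99, 0.1

# Two independent categorical return distributions per (state, action),
# initialised uniform over the support, as in double Q-learning variants.
dist_a = np.full((n_states, n_actions, n_atoms), 1.0 / n_atoms)
dist_b = np.full((n_states, n_actions, n_atoms), 1.0 / n_atoms)

def q_values(dist):
    # Expected return under the categorical distribution.
    return dist @ atoms

def adaptive_beta(dist, s, a):
    # Locally adaptive weight (assumption): the wider the return distribution
    # at (s, a), the more weight is put on the cross (double) estimator to
    # curb overestimation; narrow distributions fall back to the own estimate.
    q = dist[s, a] @ atoms
    std = np.sqrt(dist[s, a] @ (atoms - q) ** 2)
    return np.clip(std / (v_max - v_min), 0.0, 1.0)

def project(target_atoms, target_probs):
    # Standard categorical projection of a shifted support back onto `atoms`.
    projected = np.zeros(n_atoms)
    clipped = np.clip(target_atoms, v_min, v_max)
    pos = (clipped - v_min) / (atoms[1] - atoms[0])
    lower, upper = np.floor(pos).astype(int), np.ceil(pos).astype(int)
    for p, l, u, x in zip(target_probs, lower, upper, pos):
        if u == l:
            projected[l] += p
        else:
            projected[l] += p * (u - x)
            projected[u] += p * (x - l)
    return projected

def update_a(s, a, r, s_next, done):
    # Greedy action picked by estimator A; the bootstrap distribution mixes
    # A's own evaluation with B's evaluation of that action, weighted by the
    # locally computed beta.  A symmetric update for dist_b (with roles of A
    # and B swapped) would be applied on alternating steps.
    a_star = int(np.argmax(q_values(dist_a)[s_next]))
    beta = adaptive_beta(dist_a, s_next, a_star)
    mixed = (1.0 - beta) * dist_a[s_next, a_star] + beta * dist_b[s_next, a_star]
    scale = 0.0 if done else gamma
    target = project(r + scale * atoms, mixed)
    dist_a[s, a] = (1.0 - lr) * dist_a[s, a] + lr * target
```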
Cite
Text
Döring et al. "ADDQ: Adaptive Distributional Double Q-Learning." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown
[Döring et al. "ADDQ: Adaptive Distributional Double Q-Learning." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/doring2025icml-addq/)

BibTeX
@inproceedings{doring2025icml-addq,
title = {{ADDQ: Adaptive Distributional Double Q-Learning}},
author = {Döring, Leif and Wille, Benedikt and Birr, Maximilian and Bîrsan, Mihail and Slowik, Martin},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {14344-14390},
volume = {267},
url = {https://mlanthology.org/icml/2025/doring2025icml-addq/}
}