Bridging the Performance-Gap Between Target-Free and Target-Based Reinforcement Learning

Vincent, Théo; Tripathi, Yogesh; Faust, Tim; Akgül, Abdullah; Oren, Yaniv; Kandemir, Melih; Peters, Jan; D'Eramo, Carlo

ICLR 2026

Abstract

The use of target networks in deep reinforcement learning is a popular solution to mitigate the brittleness of semi-gradient approaches and stabilize learning. However, target networks require additional memory and delay the propagation of Bellman updates compared to an ideal target-free approach. In this work, we step out of the binary choice between target-free and target-based algorithms. We introduce a new method that uses a copy of the last linear layer of the online network as a target network, while sharing the remaining parameters with the up-to-date online network. This simple modification enables us to keep the low memory footprint of target-free methods while leveraging the target-based literature. We find that combining our approach with the concept of iterated $Q$-learning, which consists of learning consecutive Bellman updates in parallel, helps improve the sample efficiency of target-free approaches. Our proposed method, iterated Shared $Q$-Learning (iS-QL), bridges the performance gap between target-free and target-based approaches across various problems while using a single $Q$-network, thus stepping towards resource-efficient reinforcement learning algorithms.
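
To make the shared-parameter idea in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the one-hidden-layer network, the function names (`relu_features`, `head_values`, `td_error`, `sync_target_head`), the layer sizes, and the discount factor are all assumptions chosen for illustration, and the iterated part of iS-QL is not covered. The sketch only shows a frozen copy of the last linear layer acting as the target while the feature layers are shared with the online network.

```python
import random

def relu_features(shared_W, state):
    """Shared feature extractor (assumed: one ReLU layer), used by both heads."""
    return [max(0.0, sum(w * s for w, s in zip(row, state))) for row in shared_W]

def head_values(head, features):
    """Last linear layer: one Q-value per action."""
    return [sum(w * f for w, f in zip(row, features)) for row in head]

def td_error(shared_W, online_head, target_head, transition, gamma=0.99):
    """TD error where only the last layer of the target is a frozen copy."""
    state, action, reward, next_state, done = transition
    q_online = head_values(online_head, relu_features(shared_W, state))[action]
    # The target reuses the up-to-date shared features; only its head is frozen.
    next_q = head_values(target_head, relu_features(shared_W, next_state))
    target = reward + (0.0 if done else gamma * max(next_q))
    return target - q_online

def sync_target_head(online_head):
    """Periodic target update: copy the last layer only."""
    return [row[:] for row in online_head]

rng = random.Random(0)
shared_W = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]     # 3 inputs, 4 features
online_head = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]  # 2 actions
target_head = sync_target_head(online_head)

# Hypothetical transition: (state, action, reward, next_state, done).
transition = ([0.1, -0.2, 0.3], 1, 1.0, [0.0, 0.5, -0.1], False)
delta = td_error(shared_W, online_head, target_head, transition)

# Extra memory relative to a target-free agent: the copied head only.
extra_params = sum(len(row) for row in target_head)
print(delta, extra_params)
```

In this sketch the additional storage is the size of one linear layer rather than a full network copy, which is the memory argument the abstract makes. In a gradient-based implementation the target value would also be treated as a constant during the update; that detail is omitted here.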

Cite

Text

Vincent et al. "Bridging the Performance-Gap Between Target-Free and Target-Based Reinforcement Learning." International Conference on Learning Representations, 2026.

Markdown

[Vincent et al. "Bridging the Performance-Gap Between Target-Free and Target-Based Reinforcement Learning." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/vincent2026iclr-bridging/)

BibTeX

@inproceedings{vincent2026iclr-bridging,
  title     = {{Bridging the Performance-Gap Between Target-Free and Target-Based Reinforcement Learning}},
  author    = {Vincent, Théo and Tripathi, Yogesh and Faust, Tim and Akgül, Abdullah and Oren, Yaniv and Kandemir, Melih and Peters, Jan and D'Eramo, Carlo},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/vincent2026iclr-bridging/}
}