Unlocking the Power of Representations in Long-Term Novelty-Based Exploration

Abstract

We introduce Robust Exploration via Clustering-based Online Density Estimation (RECODE), a non-parametric method for novelty-based exploration that estimates visitation counts for clusters of states based on their similarity in a chosen embedding space. By adapting classical clustering to the nonstationary setting of deep RL, RECODE can efficiently track state visitation counts over thousands of episodes. We further propose a novel generalization of the inverse dynamics loss, which leverages masked transformer architectures for multi-step prediction. In conjunction with RECODE, this loss achieves a new state of the art on a suite of challenging 3D exploration tasks in DM-HARD-8. RECODE also sets a new state of the art in hard-exploration Atari games, and is the first agent to reach the end screen in Pitfall!
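To make the idea of counting visits per cluster of embeddings concrete, here is a minimal toy sketch. It is not the RECODE algorithm from the paper: the class name `ClusterCounter`, the fixed `radius` threshold, the evict-the-least-visited rule, and the `1 / sqrt(count)` bonus are all assumptions chosen for illustration, and the embeddings are plain vectors rather than learned representations.

```python
import numpy as np


class ClusterCounter:
    """Toy online visitation counter over a bounded set of cluster centers.

    Illustrative only: every design choice here is an assumption,
    not the method described in the paper.
    """

    def __init__(self, max_clusters, radius):
        self.max_clusters = max_clusters  # memory budget (hypothetical parameter)
        self.radius = radius              # merge threshold (hypothetical parameter)
        self.centers = []                 # one embedding vector per cluster
        self.counts = []                  # visitation count per cluster

    def update(self, embedding):
        """Record one visit and return a novelty bonus in (0, 1]."""
        x = np.asarray(embedding, dtype=float)
        if self.centers:
            dists = [np.linalg.norm(x - c) for c in self.centers]
            i = int(np.argmin(dists))
            if dists[i] <= self.radius:
                # Close enough to an existing cluster: bump its count
                # and move its center toward x (running mean).
                self.counts[i] += 1.0
                self.centers[i] += (x - self.centers[i]) / self.counts[i]
                return 1.0 / np.sqrt(self.counts[i])
        # Otherwise start a new cluster, evicting the least-visited
        # one first if the memory budget is full.
        if len(self.centers) >= self.max_clusters:
            j = int(np.argmin(self.counts))
            del self.centers[j]
            del self.counts[j]
        self.centers.append(x.copy())
        self.counts.append(1.0)
        return 1.0
```

In this sketch, a state whose embedding falls near a frequently visited cluster receives a small bonus, while a state far from every stored center receives the maximum bonus of 1.0. Bounding the number of clusters keeps memory constant over long training runs, at the cost of forgetting rarely visited regions.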

Cite

Text

Kapturowski et al. "Unlocking the Power of Representations in Long-Term Novelty-Based Exploration." NeurIPS 2023 Workshops: ALOE, 2023.

Markdown

[Kapturowski et al. "Unlocking the Power of Representations in Long-Term Novelty-Based Exploration." NeurIPS 2023 Workshops: ALOE, 2023.](https://mlanthology.org/neuripsw/2023/kapturowski2023neuripsw-unlocking/)

BibTeX

@inproceedings{kapturowski2023neuripsw-unlocking,
  title     = {{Unlocking the Power of Representations in Long-Term Novelty-Based Exploration}},
  author    = {Kapturowski, Steven and Saade, Alaa and Calandriello, Daniele and Blundell, Charles and Sprechmann, Pablo and Sarra, Leopoldo and Groth, Oliver and Valko, Michal and Piot, Bilal},
  booktitle = {NeurIPS 2023 Workshops: ALOE},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/kapturowski2023neuripsw-unlocking/}
}