BYOL-Explore: Exploration by Bootstrapped Prediction
Abstract
We present BYOL-Explore, a conceptually simple yet general approach for curiosity-driven exploration in visually complex environments. BYOL-Explore learns the world representation, the world dynamics, and the exploration policy all together by optimizing a single prediction loss in the latent space with no additional auxiliary objective. We show that BYOL-Explore is effective in DM-HARD-8, a challenging partially-observable continuous-action hard-exploration benchmark with visually rich 3-D environments. On this benchmark, we solve the majority of the tasks purely through augmenting the extrinsic reward with the BYOL-Explore intrinsic reward, whereas prior work could only get off the ground with human demonstrations. As further evidence of the generality of BYOL-Explore, we show that it achieves superhuman performance on the ten hardest exploration games in Atari while having a much simpler design than other competitive agents.
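The idea of deriving an intrinsic reward from a latent-space prediction loss can be illustrated with a small sketch. This is not the authors' implementation: it assumes a BYOL-style setup in which the loss is the squared distance between L2-normalized predicted and target embeddings, the target network is an exponential moving average of the online network, and the intrinsic reward is added to the extrinsic reward with a fixed scale. All function names, the `decay` value, and the `scale` value are hypothetical.

```python
import math


def l2_normalize(v, eps=1e-8):
    """Scale a vector to (approximately) unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def latent_prediction_loss(predicted, target):
    """Squared distance between normalized predicted and target embeddings.

    Assumption: a BYOL-style normalized loss; its value lies in [0, 4].
    """
    p = l2_normalize(predicted)
    t = l2_normalize(target)
    return sum((a - b) ** 2 for a, b in zip(p, t))


def ema_update(target_params, online_params, decay=0.99):
    """Move target parameters slowly toward the online parameters.

    Assumption: the target network is an exponential moving average.
    """
    return [decay * t + (1.0 - decay) * o
            for t, o in zip(target_params, online_params)]


def augmented_reward(extrinsic, intrinsic, scale=0.1):
    """Combine the task reward with a scaled intrinsic reward (hypothetical scale)."""
    return extrinsic + scale * intrinsic


# A poorly predicted latent yields a larger intrinsic reward than a well predicted one.
well_predicted = latent_prediction_loss([1.0, 0.0], [1.0, 0.0])
poorly_predicted = latent_prediction_loss([1.0, 0.0], [0.0, 1.0])
reward = augmented_reward(extrinsic=0.0, intrinsic=poorly_predicted)
```

In this sketch, transitions the world model predicts poorly receive a higher reward, which is what steers the policy toward unfamiliar parts of the environment.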
Cite
Text
Guo et al. "BYOL-Explore: Exploration by Bootstrapped Prediction." Neural Information Processing Systems, 2022.
Markdown
[Guo et al. "BYOL-Explore: Exploration by Bootstrapped Prediction." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/guo2022neurips-byolexplore/)
BibTeX
@inproceedings{guo2022neurips-byolexplore,
title = {{BYOL-Explore: Exploration by Bootstrapped Prediction}},
author = {Guo, Zhaohan and Thakoor, Shantanu and Pislar, Miruna and Pires, Bernardo Avila and Altché, Florent and Tallec, Corentin and Saade, Alaa and Calandriello, Daniele and Grill, Jean-Bastien and Tang, Yunhao and Valko, Michal and Munos, Remi and Azar, Mohammad Gheshlaghi and Piot, Bilal},
booktitle = {Neural Information Processing Systems},
year = {2022},
url = {https://mlanthology.org/neurips/2022/guo2022neurips-byolexplore/}
}