See, Hear, Explore: Curiosity via Audio-Visual Association

Abstract

Exploration is one of the core challenges in reinforcement learning. A common formulation of curiosity-driven exploration uses the difference between the real future and the future predicted by a learned model. However, predicting the future is an inherently difficult task that can be ill-posed in the face of stochasticity. In this paper, we introduce an alternative form of curiosity that rewards novel associations between different senses. Our approach exploits multiple modalities to provide a stronger signal for more efficient exploration. Our method is inspired by the fact that, for humans, both sight and sound play a critical role in exploration. We present results on several Atari environments and Habitat (a photorealistic navigation simulator), showing the benefits of using an audio-visual association model for intrinsically guiding learning agents in the absence of external rewards. For videos and code, see https://vdean.github.io/audio-curiosity.html.
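To make the contrast in the abstract concrete, the following is a minimal sketch of the two kinds of intrinsic reward, not the authors' implementation. The function names, the squared-error form, and the negative-log-probability form are all assumptions chosen for illustration; `p_aligned` stands for the output of a hypothetical learned discriminator that estimates how likely it is that a visual observation and an audio observation belong together.

```python
import math

def prediction_error_reward(predicted_next, actual_next):
    """Future-prediction curiosity (illustrative): squared error between
    the predicted and the observed next-state features."""
    return sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))

def association_reward(p_aligned, eps=1e-8):
    """Association curiosity (assumed form): the reward grows as a
    hypothetical discriminator becomes less confident that the observed
    visual and audio features go together, i.e. the pairing looks novel."""
    p = min(max(float(p_aligned), eps), 1.0)  # clamp to avoid log(0)
    return -math.log(p)

# A familiar audio-visual pairing earns little reward; a surprising one earns more.
familiar = association_reward(0.9)
novel = association_reward(0.1)
```

Under these assumptions, the association reward depends only on observations from the current step, so it does not require the agent to predict a stochastic future.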

Cite

Text

Dean et al. "See, Hear, Explore: Curiosity via Audio-Visual Association." Neural Information Processing Systems, 2020.

Markdown

[Dean et al. "See, Hear, Explore: Curiosity via Audio-Visual Association." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/dean2020neurips-see/)

BibTeX

@inproceedings{dean2020neurips-see,
  title     = {{See, Hear, Explore: Curiosity via Audio-Visual Association}},
  author    = {Dean, Victoria and Tulsiani, Shubham and Gupta, Abhinav},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/dean2020neurips-see/}
}