Diversity Progress for Goal Selection in Discriminability-Motivated RL
Abstract
Non-uniform goal selection has the potential to improve the reinforcement learning (RL) of skills over uniform-random selection. In this paper, we introduce a method for learning a goal-selection policy in intrinsically-motivated goal-conditioned RL: "Diversity Progress" (DP). The learner forms a curriculum based on observed improvement in discriminability over its set of goals. Our proposed method is applicable to the class of discriminability-motivated agents, where the intrinsic reward is computed as a function of the agent's certainty of following the true goal being pursued. This reward can motivate the agent to learn a set of diverse skills without extrinsic rewards. We demonstrate empirically that a DP-motivated agent can learn a set of distinguishable skills faster than previous approaches, and do so without suffering from a collapse of the goal distribution---a known issue with some prior approaches. We end with plans to take this proof-of-concept forward.
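The abstract describes a goal-selection curriculum driven by observed improvement in discriminability over the goal set. The following is a minimal sketch of one way such a selector could look; it assumes a goal discriminator whose average confidence per episode (e.g. mean log q(g|s)) is reported back after each rollout, and the class name, exponential smoothing, absolute-progress measure, and softmax sampling are illustrative choices, not the paper's implementation.

```python
import numpy as np

class DiversityProgressSelector:
    """Illustrative sketch: sample goals in proportion to recent improvement
    in per-goal discriminability (how confidently a learned discriminator
    recovers the pursued goal from the states the agent visits)."""

    def __init__(self, num_goals, smoothing=0.1, temperature=1.0):
        self.num_goals = num_goals
        self.smoothing = smoothing             # EMA coefficient for discriminability estimates
        self.temperature = temperature         # softmax temperature for goal sampling
        self.disc = np.zeros(num_goals)        # current smoothed discriminability per goal
        self.prev_disc = np.zeros(num_goals)   # earlier snapshot, used to measure progress

    def update(self, goal, discriminability):
        """Record the discriminator's average confidence for the goal just pursued."""
        self.prev_disc[goal] = self.disc[goal]
        self.disc[goal] = ((1 - self.smoothing) * self.disc[goal]
                           + self.smoothing * discriminability)

    def sample_goal(self, rng=np.random):
        """Sample a goal with probability increasing in its recent progress."""
        progress = np.abs(self.disc - self.prev_disc)  # absolute improvement per goal
        logits = progress / self.temperature
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()                           # softmax over goals
        return rng.choice(self.num_goals, p=probs)
```

In this sketch, goals whose discriminability is changing fastest are sampled most often, so the curriculum keeps spreading probability mass across goals rather than collapsing onto a few already-distinguishable ones.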
Cite
Text

Lintunen et al. "Diversity Progress for Goal Selection in Discriminability-Motivated RL." NeurIPS 2024 Workshops: IMOL, 2024.

Markdown

[Lintunen et al. "Diversity Progress for Goal Selection in Discriminability-Motivated RL." NeurIPS 2024 Workshops: IMOL, 2024.](https://mlanthology.org/neuripsw/2024/lintunen2024neuripsw-diversity/)

BibTeX
@inproceedings{lintunen2024neuripsw-diversity,
  title     = {{Diversity Progress for Goal Selection in Discriminability-Motivated RL}},
  author    = {Lintunen, Erik M. and Ady, Nadia M. and Guckelsberger, Christian},
  booktitle = {NeurIPS 2024 Workshops: IMOL},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/lintunen2024neuripsw-diversity/}
}