Skill Disentanglement in Reproducing Kernel Hilbert Space

Abstract

Unsupervised Skill Discovery aims to learn diverse skills without extrinsic rewards, using them as priors for downstream tasks. Existing methods focus on empowerment or entropy maximization but often yield static or poorly discriminable skills. In contrast, our method, Hilbert Unsupervised Skill Discovery (HUSD), combines $f$-divergence with Integral Probability Metrics to promote behavioral diversity and disentanglement. HUSD maximizes the Maximum Mean Discrepancy between the joint distribution of skills and states and the product of their marginals in a Reproducing Kernel Hilbert Space, leading to better exploration and skill separability. Our results on Unsupervised RL Benchmarks show that HUSD outperforms previous exploration algorithms on state-based tasks.
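The core quantity in the abstract, the Maximum Mean Discrepancy between the joint (state, skill) distribution and the product of its marginals, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the RBF kernel, the bandwidth, the synthetic data, and the shuffle-to-break-dependence trick are all illustrative assumptions; the estimate is large when states and skills are dependent (separable skills) and near zero when they are independent.

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    # Pairwise RBF (Gaussian) kernel between rows of X and rows of Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(X, Y, sigma=1.0):
    # Biased empirical estimate of MMD^2 between samples X ~ P and Y ~ Q:
    # E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)].
    kxx = rbf_kernel(X, X, sigma).mean()
    kyy = rbf_kernel(Y, Y, sigma).mean()
    kxy = rbf_kernel(X, Y, sigma).mean()
    return kxx + kyy - 2.0 * kxy

rng = np.random.default_rng(0)
n = 256
skills = rng.integers(0, 4, size=n)                     # hypothetical discrete skills z
states = rng.normal(skills[:, None], 0.3, (n, 2))       # states correlated with skills

# Joint samples pair each state with its own skill; shuffling the skills
# simulates drawing from the product of marginals (dependence is broken).
joint = np.concatenate([states, skills[:, None].astype(float)], axis=1)
shuffled = skills[rng.permutation(n)]
marginal = np.concatenate([states, shuffled[:, None].astype(float)], axis=1)

score = mmd2(joint, marginal)   # large when states are informative about skills
```

Maximizing such a score over a skill-conditioned policy pushes each skill toward a distinct region of state space, which is the disentanglement objective the abstract describes.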

Cite

Text

Dave and Rueckert. "Skill Disentanglement in Reproducing Kernel Hilbert Space." NeurIPS 2024 Workshops: IMOL, 2024.

Markdown

[Dave and Rueckert. "Skill Disentanglement in Reproducing Kernel Hilbert Space." NeurIPS 2024 Workshops: IMOL, 2024.](https://mlanthology.org/neuripsw/2024/dave2024neuripsw-skill/)

BibTeX

@inproceedings{dave2024neuripsw-skill,
  title     = {{Skill Disentanglement in Reproducing Kernel Hilbert Space}},
  author    = {Dave, Vedant and Rueckert, Elmar},
  booktitle = {NeurIPS 2024 Workshops: IMOL},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/dave2024neuripsw-skill/}
}