Masked Skill Token Training for Hierarchical Off-Dynamics Transfer

Abstract

Generalizing policies across environments with altered dynamics remains a key challenge in reinforcement learning, particularly in offline settings where direct interaction or fine-tuning is impractical. We introduce Masked Skill Token Training (MSTT), a fully offline hierarchical RL framework that enables policy transfer using observation-only demonstrations. MSTT constructs a discrete skill space via unsupervised trajectory tokenization and trains a skill-conditioned value function using masked Bellman updates, which simulate dynamics shifts by selectively disabling skills. A diffusion-based trajectory generator, paired with feasibility-based filtering, enables the agent to execute valid, temporally extended actions without requiring action labels or access to the target environment. Our results in both discrete and continuous domains demonstrate the potential of mask-guided planning for robust generalization under dynamics shifts. To our knowledge, MSTT is the first work to explore masking as a mechanism for simulating and generalizing across off-dynamics environments. It marks a promising step toward scalable, structure-aware transfer and opens avenues for exploring multi-goal conditioning and extensions to more complex, real-world scenarios.
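To make the idea of a masked Bellman update concrete, the following is a minimal tabular sketch, not the authors' implementation. It assumes a small discrete skill set, a hypothetical boolean mask marking which skills remain enabled, and a one-step target that maximizes only over enabled skills. The function name `masked_bellman_target`, the array layout, and the numbers are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np


def masked_bellman_target(q_next, mask, reward, gamma=0.99, done=False):
    """One-step target that maximizes only over skills the mask leaves enabled.

    q_next : (K,) skill values at the next state (hypothetical layout)
    mask   : (K,) booleans, True means the skill is still available
    """
    q_next = np.asarray(q_next, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    if done or not mask.any():
        # No bootstrapping at terminal states or when every skill is disabled.
        return float(reward)
    # Disabled skills get -inf so they can never win the max.
    masked_q = np.where(mask, q_next, -np.inf)
    return float(reward + gamma * masked_q.max())


# Illustrative values: skill 1 is the best choice under the original dynamics.
q_next = np.array([1.0, 5.0, 3.0])
full = masked_bellman_target(q_next, [True, True, True], reward=0.0, gamma=0.9)
# Disabling skill 1 mimics a dynamics shift that makes it infeasible.
shifted = masked_bellman_target(q_next, [True, False, True], reward=0.0, gamma=0.9)
```

In this toy setting the target falls back to the best remaining skill once skill 1 is masked out, which is the sense in which a mask can stand in for a change in dynamics. A learned value function would additionally take the mask as an input; that conditioning is omitted here for brevity.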

Cite

Text

Feng et al. "Masked Skill Token Training for Hierarchical Off-Dynamics Transfer." International Conference on Learning Representations, 2026.

Markdown

[Feng et al. "Masked Skill Token Training for Hierarchical Off-Dynamics Transfer." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/feng2026iclr-masked/)

BibTeX

@inproceedings{feng2026iclr-masked,
  title     = {{Masked Skill Token Training for Hierarchical Off-Dynamics Transfer}},
  author    = {Feng, Zeyu and Yin, Haiyan and Ong, Yew-Soon and Soh, Harold},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/feng2026iclr-masked/}
}