Recurrent World Model with Tokenized Latent States

Abstract

World models have become increasingly popular in recent years. We introduce TokenWM, a new architecture that retains the recurrent nature of state-space models while incorporating tokenized latent states and a memory-augmented attention mechanism to increase modeling capacity in complex environments. Preliminary results on the LIBERO benchmarks demonstrate that the new architecture is better suited to complex tasks than the popular RSSM architecture. We believe TokenWM introduces a new design paradigm for recurrent world models, enabling more expressive and scalable decision-making in complex environments.
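
To make the abstract's ingredients concrete (a latent state represented as a set of tokens, updated recurrently, with attention over a learned memory bank), here is a minimal PyTorch sketch of one recurrent update. All module names, token counts, and dimensions are illustrative assumptions for this sketch, not the authors' implementation.

```python
import torch
import torch.nn as nn


class TokenizedRecurrentCell(nn.Module):
    """One recurrent world-model update in the spirit of TokenWM (assumed
    structure): the latent state is a set of K tokens that attends over the
    previous state, the current action and observation embeddings, and a
    bank of persistent memory tokens."""

    def __init__(self, num_state_tokens=8, num_memory_tokens=16,
                 dim=256, num_heads=4):
        super().__init__()
        # Learned initial state tokens and persistent memory bank
        # (sizes are arbitrary choices for illustration).
        self.state0 = nn.Parameter(torch.randn(num_state_tokens, dim))
        self.memory = nn.Parameter(torch.randn(num_memory_tokens, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, state, action_emb, obs_emb):
        # state: (B, K, D); action_emb, obs_emb: (B, 1, D)
        B = state.shape[0]
        memory = self.memory.unsqueeze(0).expand(B, -1, -1)
        # Memory-augmented attention: state tokens query the concatenation
        # of the previous state, the conditioning inputs, and the memory.
        context = torch.cat([state, action_emb, obs_emb, memory], dim=1)
        attended, _ = self.attn(query=state, key=context, value=context)
        state = self.norm(state + attended)
        return state + self.mlp(state)


# Example: roll the cell forward for two steps with random embeddings.
cell = TokenizedRecurrentCell()
state = cell.state0.unsqueeze(0).expand(2, -1, -1)  # batch size 2
for _ in range(2):
    state = cell(state, torch.randn(2, 1, 256), torch.randn(2, 1, 256))
print(state.shape)  # torch.Size([2, 8, 256])
```

The key difference from an RSSM-style cell is that the state here is a set of tokens updated by attention rather than a single vector updated by a GRU, which is one plausible reading of how tokenization increases modeling capacity.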

Cite

Text

Zhai et al. "Recurrent World Model with Tokenized Latent States." ICLR 2025 Workshops: World_Models, 2025.

Markdown

[Zhai et al. "Recurrent World Model with Tokenized Latent States." ICLR 2025 Workshops: World_Models, 2025.](https://mlanthology.org/iclrw/2025/zhai2025iclrw-recurrent/)

BibTeX

@inproceedings{zhai2025iclrw-recurrent,
  title     = {{Recurrent World Model with Tokenized Latent States}},
  author    = {Zhai, Guangyao and Zhang, Xingyuan and Navab, Nassir},
  booktitle = {ICLR 2025 Workshops: World_Models},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/zhai2025iclrw-recurrent/}
}