DreamGen: Unlocking Generalization in Robot Learning Through Video World Models

Jang, Joel; Ye, Seonghyeon; Lin, Zongyu; Xiang, Jiannan; Bjorck, Johan; Fang, Yu; Hu, Fengyuan; Huang, Spencer; Kundalia, Kaushil; Lin, Yen-Chen; Magne, Loïc; Mandlekar, Ajay; Narayan, Avnish; Tan, You Liang; Wang, Guanzhi; Wang, Jing; Wang, Qi; Xu, Yinzhen; Zeng, Xiaohui; Zheng, Kaiyuan; Zheng, Ruijie; Liu, Ming-Yu; Zettlemoyer, Luke; Fox, Dieter; Kautz, Jan; Reed, Scott; Zhu, Yuke; Fan, Linxi

CoRL 2025 pp. 5170-5194

Abstract

In this work, we unlock new capabilities in robot learning from neural trajectories, synthetic robot data generated from video world models. Our proposed recipe is simple, but powerful: we take the most recent state-of-the-art video generative models (world models), adapt them to the target robot embodiment, and generate new, synthetic robot data of the same task or even new behaviors. Since these video world models only generate videos, we explore two techniques of getting robot actions: extracting latent actions from a general-purpose latent action model and getting predicted actions from an inverse-dynamics model (IDM), giving flexibility across diverse scenarios. Our proposed approach unlocks behavior and environment generalization, allowing a humanoid robot to perform 20+ new behaviors in unseen environments while only collecting teleoperation data for pick and place in a single environment. By introducing a new world modeling benchmark, we demonstrate that stronger video world models directly correlate with improved downstream robot policy performance. This establishes a new scaling dimension beyond simply collecting additional teleoperation data, changing how we approach robot learning.

Cite

Text

Jang et al. "DreamGen: Unlocking Generalization in Robot Learning Through Video World Models." Proceedings of The 9th Conference on Robot Learning, 2025.

Markdown

[Jang et al. "DreamGen: Unlocking Generalization in Robot Learning Through Video World Models." Proceedings of The 9th Conference on Robot Learning, 2025.](https://mlanthology.org/corl/2025/jang2025corl-dreamgen/)

BibTeX

@inproceedings{jang2025corl-dreamgen,
  title     = {{DreamGen: Unlocking Generalization in Robot Learning Through Video World Models}},
  author    = {Jang, Joel and Ye, Seonghyeon and Lin, Zongyu and Xiang, Jiannan and Bjorck, Johan and Fang, Yu and Hu, Fengyuan and Huang, Spencer and Kundalia, Kaushil and Lin, Yen-Chen and Magne, Loïc and Mandlekar, Ajay and Narayan, Avnish and Tan, You Liang and Wang, Guanzhi and Wang, Jing and Wang, Qi and Xu, Yinzhen and Zeng, Xiaohui and Zheng, Kaiyuan and Zheng, Ruijie and Liu, Ming-Yu and Zettlemoyer, Luke and Fox, Dieter and Kautz, Jan and Reed, Scott and Zhu, Yuke and Fan, Linxi},
  booktitle = {Proceedings of The 9th Conference on Robot Learning},
  year      = {2025},
  pages     = {5170--5194},
  volume    = {305},
  url       = {https://mlanthology.org/corl/2025/jang2025corl-dreamgen/}
}