A Survey on Future Frame Synthesis: Bridging Deterministic and Generative Approaches

Abstract

Future Frame Synthesis (FFS), the task of generating subsequent video frames from context, represents a core challenge in machine intelligence and a cornerstone for developing predictive world models. This survey provides a comprehensive analysis of the FFS landscape, charting its critical evolution from deterministic algorithms focused on pixel-level accuracy to modern generative paradigms that prioritize semantic coherence and dynamic plausibility. We introduce a novel taxonomy organized by algorithmic stochasticity, which not only categorizes existing methods but also reveals the fundamental drivers behind this paradigm shift: advances in architectures, datasets, and computational scale. Critically, our analysis identifies a bifurcation in the field's trajectory: one path toward efficient, real-time prediction, and another toward large-scale, generative world simulation. By pinpointing key challenges and proposing concrete research questions for both paths, this survey serves as a guide for researchers aiming to advance visual dynamic modeling.

Cite

Text

Ming et al. "A Survey on Future Frame Synthesis: Bridging Deterministic and Generative Approaches." Transactions on Machine Learning Research, 2025.

Markdown

[Ming et al. "A Survey on Future Frame Synthesis: Bridging Deterministic and Generative Approaches." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/ming2025tmlr-survey/)

BibTeX

@article{ming2025tmlr-survey,
  title     = {{A Survey on Future Frame Synthesis: Bridging Deterministic and Generative Approaches}},
  author    = {Ming, Ruibo and Huang, Zhewei and Wu, Jingwei and Ju, Zhuoxuan and Jiang, Daxin and Hu, Jianming and Peng, Lihui and Zhou, Shuchang},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/ming2025tmlr-survey/}
}