Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals

Stojanov, Stefan; Wendt, David; Kim, Seungwoo; Venkatesh, Rahul Mysore; Feigelis, Kevin; Kotar, Klemen; Aw, Khai Loong; Wu, Jiajun; Yamins, Daniel LK

NeurIPS 2025

Abstract

Estimating motion primitives from video (e.g., optical flow and occlusion) is a critically important computer vision problem with many downstream applications, including controllable video generation and robotics. Current solutions are primarily supervised on synthetic data or require tuning of situation-specific heuristics, which inherently limits these models' capabilities in real-world contexts. A natural solution to transcend these limitations would be to deploy large-scale, self-supervised video models, which can be trained scalably on unrestricted real-world video datasets. However, despite recent progress, motion-primitive extraction from large pretrained video models remains relatively underexplored. In this work, we describe Opt-CWM, a self-supervised flow and occlusion estimation technique from a pretrained video prediction model. Opt-CWM uses "counterfactual probes" to extract motion information from a base video model in a zero-shot fashion. The key problem we solve is optimizing the quality of these probes, using a combination of an efficient parameterization of the space of counterfactual probes, together with a novel generic sparse-prediction principle for learning the probe-generation parameters in a self-supervised fashion. Opt-CWM achieves state-of-the-art performance for motion estimation on real-world videos while requiring no labeled data.
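The abstract does not spell out how a probe is constructed or read out, so the following is only a toy sketch of the general idea of counterfactual probing, not the Opt-CWM method itself. It assumes a probe is a small perturbation added to the first frame, and that motion is read off from where the predictor's output changes most. The names `toy_predictor`, `gaussian_probe`, and `probe_flow` are hypothetical, and the "predictor" is a stand-in that merely translates the frame by a fixed offset.

```python
import numpy as np


def toy_predictor(frame0, shift=(2, 3)):
    """Hypothetical stand-in for a pretrained next-frame predictor.

    It translates the first frame by a fixed (rows, cols) offset. A real
    video model would infer the motion from visual context instead.
    """
    return np.roll(frame0, shift, axis=(0, 1))


def gaussian_probe(shape, center, sigma=1.0, amplitude=1.0):
    """Small Gaussian bump centred at `center` (row, col)."""
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    d2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return amplitude * np.exp(-d2 / (2.0 * sigma ** 2))


def probe_flow(predictor, frame0, point, sigma=1.0, amplitude=1.0):
    """Estimate the displacement of `point` via a counterfactual probe.

    Assumed readout: predict the next frame with and without the
    perturbation, then take the location of the largest difference as
    the place the probed point moved to.
    """
    clean = predictor(frame0)
    probe = gaussian_probe(frame0.shape, point, sigma, amplitude)
    perturbed = predictor(frame0 + probe)
    response = np.abs(perturbed - clean)
    dest = np.unravel_index(np.argmax(response), response.shape)
    return (int(dest[0]) - point[0], int(dest[1]) - point[1])


rng = np.random.default_rng(0)
frame0 = rng.random((32, 32))
flow = probe_flow(toy_predictor, frame0, point=(10, 12))
print(flow)
```

In this sketch `sigma` and `amplitude` are fixed by hand. The abstract describes learning the probe-generation parameters in a self-supervised fashion, which this example does not attempt.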

Cite

Text

Stojanov et al. "Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals." Advances in Neural Information Processing Systems, 2025.

Markdown

[Stojanov et al. "Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/stojanov2025neurips-selfsupervised/)

BibTeX

@inproceedings{stojanov2025neurips-selfsupervised,
  title     = {{Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals}},
  author    = {Stojanov, Stefan and Wendt, David and Kim, Seungwoo and Venkatesh, Rahul Mysore and Feigelis, Kevin and Kotar, Klemen and Aw, Khai Loong and Wu, Jiajun and Yamins, Daniel LK},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/stojanov2025neurips-selfsupervised/}
}