VICON: Vision In-Context Operator Networks for Multi-Physics Fluid Dynamics Prediction
Abstract
In-Context Operator Networks (ICONs) have demonstrated the ability to learn operators across diverse partial differential equations using few-shot, in-context learning. However, existing ICONs process each spatial point as an individual token, severely limiting computational efficiency when handling dense data in higher spatial dimensions. We propose Vision In-Context Operator Networks (VICON), which integrate vision transformer architectures to efficiently process 2D data through patch-wise operations while preserving ICON's adaptability to multi-physics systems and varying timesteps. Evaluated across three fluid dynamics benchmarks, VICON significantly outperforms state-of-the-art baselines DPOT and MPP, reducing the average last-step rollout error by 37.9% compared to DPOT and 44.7% compared to MPP, while requiring only 72.5% and 34.8% of their respective inference times. VICON naturally supports flexible rollout strategies with varying timestep strides, enabling immediate deployment in imperfect measurement systems where sampling frequencies may differ or frames might be dropped (common challenges in real-world settings) without requiring retraining or interpolation. In these realistic scenarios, VICON exhibits remarkable robustness, experiencing only 24.41% relative performance degradation compared to 71.37% to 74.49% degradation in baseline methods, demonstrating its versatility for deployment in realistic applications. Our scripts for processing datasets and code are publicly available at https://github.com/Eydcao/VICON.
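To make the token-count argument in the abstract concrete, here is a minimal NumPy sketch of patch-wise tokenization for a single 2D frame. It is not taken from the VICON code base: the `patchify` helper, the 128x128 resolution, the 3 channels, and the patch size of 16 are all illustrative assumptions.

```python
import numpy as np


def patchify(field: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W, C) field into non-overlapping patch tokens.

    Hypothetical helper for illustration only.
    Returns an array of shape (H // patch * W // patch, patch * patch * C).
    """
    h, w, c = field.shape
    if h % patch or w % patch:
        raise ValueError("H and W must be divisible by the patch size")
    gh, gw = h // patch, w // patch
    # (gh, patch, gw, patch, c) -> (gh, gw, patch, patch, c)
    tokens = field.reshape(gh, patch, gw, patch, c).transpose(0, 2, 1, 3, 4)
    # Flatten each patch into one token vector.
    return tokens.reshape(gh * gw, patch * patch * c)


# Assumed example frame: 128 x 128 grid with 3 physical channels.
field = np.arange(128 * 128 * 3, dtype=np.float32).reshape(128, 128, 3)
tokens = patchify(field, patch=16)

# Per-point tokenization would need one token per grid point.
point_tokens = field.shape[0] * field.shape[1]
print(tokens.shape, point_tokens)
```

Under these assumed sizes, one frame becomes 64 patch tokens instead of 16384 per-point tokens, which is the kind of sequence-length reduction that makes transformer attention over dense 2D data tractable.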
Cite
Text
Cao et al. "VICON: Vision In-Context Operator Networks for Multi-Physics Fluid Dynamics Prediction." Transactions on Machine Learning Research, 2026.
Markdown
[Cao et al. "VICON: Vision In-Context Operator Networks for Multi-Physics Fluid Dynamics Prediction." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/cao2026tmlr-vicon/)
BibTeX
@article{cao2026tmlr-vicon,
title = {{VICON: Vision In-Context Operator Networks for Multi-Physics Fluid Dynamics Prediction}},
author = {Cao, Yadi and Liu, Yuxuan and Yang, Liu and Yu, Rose and Schaeffer, Hayden and Osher, Stanley},
journal = {Transactions on Machine Learning Research},
year = {2026},
url = {https://mlanthology.org/tmlr/2026/cao2026tmlr-vicon/}
}