SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems
Abstract
Surrogate models are used to predict the behavior of complex energy systems that are too expensive to simulate with traditional numerical methods. Our work introduces the use of language descriptions, which we call "system captions" or SysCaps, to interface with such surrogates. We argue that interacting with surrogates through text, particularly natural language, makes these models more accessible for both experts and non-experts. We introduce a lightweight multimodal text and timeseries regression model and a training pipeline that uses large language models (LLMs) to synthesize high-quality captions from simulation metadata. Our experiments on two real-world simulators of buildings and wind farms show that our SysCaps-augmented surrogates have better accuracy on held-out systems than traditional methods while enjoying new generalization abilities, such as handling semantically related descriptions of the same test system. Additional experiments also highlight the potential of SysCaps to unlock language-driven design space exploration and to regularize training through prompt augmentation.
Cite
Text
Emami et al. "SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems." International Conference on Learning Representations, 2025.
Markdown
[Emami et al. "SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/emami2025iclr-syscaps/)
BibTeX
@inproceedings{emami2025iclr-syscaps,
  title = {{SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems}},
  author = {Emami, Patrick and Li, Zhaonan and Sinha, Saumya and Nguyen, Truc},
  booktitle = {International Conference on Learning Representations},
  year = {2025},
  url = {https://mlanthology.org/iclr/2025/emami2025iclr-syscaps/}
}