Cowpox: Towards the Immunity of VLM-Based Multi-Agent Systems
Abstract
Vision Language Model (VLM) agents are stateful, autonomous entities capable of perceiving and interacting with their environments through vision and language. Multi-agent systems comprise specialized agents that collaborate to solve a (complex) task. A core security property is robustness, which states that the system maintains its integrity under adversarial attacks. Multi-agent systems lack robustness, as a successful exploit against one agent can spread and infect other agents, undermining the entire system’s integrity. We propose Cowpox, a defense that provably enhances the robustness of a multi-agent system through a distributed mechanism that improves the recovery rate of agents by limiting the expected number of infections to other agents. The core idea is to generate and distribute a special cure sample that immunizes an agent against the attack before exposure. We demonstrate the effectiveness of Cowpox empirically and provide theoretical robustness guarantees.
Cite
Text
Wu et al. "Cowpox: Towards the Immunity of VLM-Based Multi-Agent Systems." Proceedings of the 42nd International Conference on Machine Learning, 2025.
Markdown
[Wu et al. "Cowpox: Towards the Immunity of VLM-Based Multi-Agent Systems." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/wu2025icml-cowpox/)
BibTeX
@inproceedings{wu2025icml-cowpox,
title = {{Cowpox: Towards the Immunity of VLM-Based Multi-Agent Systems}},
author = {Wu, Yutong and Zhang, Jie and Li, Yiming and Zhang, Chao and Guo, Qing and Qiu, Han and Lukas, Nils and Zhang, Tianwei},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {68015-68035},
volume = {267},
url = {https://mlanthology.org/icml/2025/wu2025icml-cowpox/}
}