Your Agent May Misevolve: Emergent Risks in Self-Evolving LLM Agents
Abstract
Advances in Large Language Models (LLMs) have enabled a new class of self-evolving agents that autonomously improve through environmental interaction, demonstrating strong capabilities. However, self-evolution also introduces novel risks overlooked by current safety research. In this work, we study the case where an agent's self-evolution deviates in unintended ways, leading to undesirable or even harmful outcomes. We refer to this as misevolution. We evaluate misevolution along four key evolutionary pathways: model, memory, tool, and workflow. Our empirical findings reveal that misevolution is a widespread risk, affecting even agents built on top-tier LLMs (e.g., Gemini-2.5-Pro). We observe different emergent risks, such as degradation of safety alignment after memory accumulation and unintended introduction of vulnerabilities during tool creation and reuse. To our knowledge, this is the first study to systematically conceptualize misevolution and provide empirical evidence of its occurrence, highlighting an urgent need for new safety paradigms for self-evolving agents. Finally, we discuss potential mitigation strategies to inspire further research on building safer and more trustworthy self-evolving agents.
Cite
Text
Shao et al. "Your Agent May Misevolve: Emergent Risks in Self-Evolving LLM Agents." International Conference on Learning Representations, 2026.
Markdown
[Shao et al. "Your Agent May Misevolve: Emergent Risks in Self-Evolving LLM Agents." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/shao2026iclr-your/)
BibTeX
@inproceedings{shao2026iclr-your,
title = {{Your Agent May Misevolve: Emergent Risks in Self-Evolving LLM Agents}},
author = {Shao, Shuai and Ren, Qihan and Liu, Dongrui and Qian, Chen and Wei, Boyi and Guo, Dadi and JingYi, Yang and Song, Xinhao and Zhang, Linfeng and Zhang, Weinan and Shao, Jing},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/shao2026iclr-your/}
}