Beyond Magic Words: Sharpness-Aware Prompt Evolving for Robust Large Language Models with TARE

Wan, Guancheng; Fu, Lucheng; Liu, Haoxin; Jin, Yiqiao; Leong, Hui Yi; Jiang, Eric Hanchen; Geng, Hejia; Bi, Jinhe; Ma, Yunpu; Tang, Xiangru; Prakash, B. Aditya; Sun, Yizhou; Wang, Wei

ICLR 2026

Abstract

The performance of Large Language Models (LLMs) hinges on carefully engineered prompts, yet automated prompt search remains brittle: small, semantics-preserving paraphrases often cause large performance swings. Prevailing prompt optimization methods, ranging from heuristic edits and reinforcement learning to evolutionary search, primarily target point-wise accuracy. They seldom enforce paraphrase invariance or search stability, and therefore cannot remedy this brittleness in practice. We identify this brittleness as the **textual sharpness** of the **prompt landscape**. In this work, we provide the first formal treatment of textual sharpness in the discrete, semantic space of prompts, together with an operational robustness criterion over a semantic neighborhood; the design is black-box (API-only) and requires no gradients or model-parameter updates. We then introduce **TARE** (Textual Sharpness-Aware Evolving), a derivative-free framework that alternates between an inner, sampling-based adversarial search that stresses a prompt with hard paraphrases and an outer, robust selection that prefers candidates whose neighborhoods remain strong. We further propose **ATARE**, which learns anisotropic weights to shape the semantic neighborhood and adapts its radius over time to balance exploration and fidelity. We evaluate both methods on diverse tasks: by minimizing the textual sharpness gap, they yield prompts that preserve accuracy under paraphrasing and outperform accuracy-only prompt search while remaining computationally practical. The code is available for anonymous access at https://anonymous.4open.science/r/ATARE_TARE/.
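The alternation described in the abstract (an inner search for hard paraphrases, an outer selection by neighborhood strength) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `toy_paraphrase`, `toy_score`, `worst_case_score`, `sharpness_gap`, and `robust_select` are hypothetical names, the synonym table and scores are invented toy values standing in for an LLM paraphraser and a task-accuracy evaluator, and the sketch only selects among fixed candidates rather than evolving new ones.

```python
import random
from typing import Callable, List

# Hypothetical synonym table standing in for an LLM-based paraphraser.
SYNONYMS = {"carefully": "thoroughly", "step": "stage"}


def toy_paraphrase(prompt: str, rng: random.Random) -> str:
    """Swap one word for a synonym, if any word is swappable."""
    words = prompt.split()
    swappable = [i for i, w in enumerate(words) if w in SYNONYMS]
    if not swappable:
        return prompt
    i = rng.choice(swappable)
    words[i] = SYNONYMS[words[i]]
    return " ".join(words)


def toy_score(prompt: str) -> float:
    """Invented stand-in for task accuracy (toy values, not real results)."""
    if prompt.startswith("Solve"):
        return 0.8  # a "flat" prompt: stable under paraphrase
    # a "sharp" prompt: high only when the magic word is present
    return 0.9 if "carefully" in prompt else 0.4


def worst_case_score(prompt: str,
                     paraphrase: Callable[[str, random.Random], str],
                     score: Callable[[str], float],
                     k: int,
                     rng: random.Random) -> float:
    """Inner step: sample k paraphrases and keep the lowest score seen."""
    neighbors = [prompt] + [paraphrase(prompt, rng) for _ in range(k)]
    return min(score(p) for p in neighbors)


def sharpness_gap(prompt: str, k: int = 4, seed: int = 0) -> float:
    """Drop from the prompt's own score to its sampled worst-case score."""
    rng = random.Random(seed)
    worst = worst_case_score(prompt, toy_paraphrase, toy_score, k, rng)
    return toy_score(prompt) - worst


def robust_select(candidates: List[str], k: int = 4, seed: int = 0) -> str:
    """Outer step: prefer the candidate whose neighborhood stays strong."""
    rng = random.Random(seed)
    return max(
        candidates,
        key=lambda p: worst_case_score(p, toy_paraphrase, toy_score, k, rng),
    )


CANDIDATES = ["Think carefully", "Solve the task step by step"]

accuracy_only_choice = max(CANDIDATES, key=toy_score)
robust_choice = robust_select(CANDIDATES)
```

In this toy setup, accuracy-only selection picks "Think carefully" because its point-wise score is highest, while the robust selection picks "Solve the task step by step" because its score survives the sampled paraphrases. A real system would replace the toy paraphraser and scorer with black-box LLM calls.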

Cite

Text

Wan et al. "Beyond Magic Words: Sharpness-Aware Prompt Evolving for Robust Large Language Models with TARE." International Conference on Learning Representations, 2026.

BibTeX

@inproceedings{wan2026iclr-beyond,
  title     = {{Beyond Magic Words: Sharpness-Aware Prompt Evolving for Robust Large Language Models with TARE}},
  author    = {Wan, Guancheng and Fu, Lucheng and Liu, Haoxin and Jin, Yiqiao and Leong, Hui Yi and Jiang, Eric Hanchen and Geng, Hejia and Bi, Jinhe and Ma, Yunpu and Tang, Xiangru and Prakash, B. Aditya and Sun, Yizhou and Wang, Wei},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/wan2026iclr-beyond/}
}