ML Anthology
Xie, Zhixin
1 publication
NeurIPS 2025
Attack via Overfitting: 10-Shot Benign Fine-Tuning to Jailbreak LLMs
Zhixin Xie, Xurui Song, Jun Luo