Field, Severin

1 publication

NeurIPSW 2024. What Features in Prompts Jailbreak LLMs? Investigating the Mechanisms Behind Attacks. Nathalie Maria Kirch, Severin Field, Stephen Casper.