Buesser, Beat
5 publications
NeurIPSW
2024
Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs
NeurIPSW
2024
Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI