Dziemian, Mateusz

3 publications

ICLR 2025. AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents. Maksym Andriushchenko, Alexandra Souly, Mateusz Dziemian, Derek Duenas, Maxwell Lin, Justin Wang, Dan Hendrycks, Andy Zou, J Zico Kolter, Matt Fredrikson, Yarin Gal, Xander Davies
NeurIPS 2025. Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition. Andy Zou, Maxwell Lin, Eliot Krzysztof Jones, Micha V. Nowak, Mateusz Dziemian, Nick Winter, Valent Nathanael, Ayla Croft, Xander Davies, Jai Patel, Robert Kirk, Yarin Gal, Dan Hendrycks, J Zico Kolter, Matt Fredrikson
NeurIPSW 2024. Applying Refusal-Vector Ablation to Llama 3.1 70b Agents. Simon Lermen, Mateusz Dziemian, Govind Pimpale