Miehling, Erik

5 publications

ICLR 2025. Programming Refusal with Conditional Activation Steering. Bruce W. Lee, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Erik Miehling, Pierre Dognin, Manish Nagireddy, Amit Dhurandhar
NeurIPSW 2024. Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI. Ambrish Rawat, Stefan Schoepf, Giulio Zizzo, Giandomenico Cornacchia, Muhammad Zaid Hameed, Kieran Fraser, Erik Miehling, Beat Buesser, Elizabeth M. Daly, Mark Purcell, Prasanna Sattigeri, Pin-Yu Chen, Kush R. Varshney
NeurIPSW 2024. Evaluating the Prompt Steerability of Large Language Models. Erik Miehling, Michael Desmond, Karthikeyan Natesan Ramamurthy, Elizabeth M. Daly, Pierre Dognin, Jesus Rios, Djallel Bouneffouf, Miao Liu
NeurIPS 2023. Cookie Consent Has Disparate Impact on Estimation Accuracy. Erik Miehling, Rahul Nair, Elizabeth Daly, Karthikeyan Natesan Ramamurthy, Robert Redmond
NeurIPS 2019. Non-Cooperative Inverse Reinforcement Learning. Xiangyuan Zhang, Kaiqing Zhang, Erik Miehling, Tamer Basar