Li, Maximilian

4 publications

ICLR 2025. Endless Jailbreaks with Bijection Learning. Brian R.Y. Huang, Maximilian Li, Leonard Tang
NeurIPS 2024. Optimal Ablation for Interpretability. Maximilian Li, Lucas Janson
ICMLW 2023. Circuit Breaking: Removing Model Behaviors with Targeted Ablation. Maximilian Li, Xander Davies, Max Nadeau
NeurIPS 2023. Information Maximizing Curriculum: A Curriculum-Based Approach for Learning Versatile Skills. Denis Blessing, Onur Celik, Xiaogang Jia, Moritz Reuss, Maximilian Li, Rudolf Lioutikov, Gerhard Neumann