Minder, Julian

4 publications

ICLR 2025. Controllable Context Sensitivity and the Knob Behind It. Julian Minder, Kevin Du, Niklas Stoehr, Giovanni Monea, Chris Wendler, Robert West, Ryan Cotterell
NeurIPS 2025. Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning. Julian Minder, Clément Dumas, Caden Juang, Bilal Chughtai, Neel Nanda
ICLRW 2025. Robustly Identifying Concepts Introduced During Chat Fine-Tuning Using Crosscoders. Julian Minder, Clément Dumas, Bilal Chughtai, Neel Nanda
NeurIPS 2025. The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability? Denis Sutter, Julian Minder, Thomas Hofmann, Tiago Pimentel