ML Anthology
Minder, Julian
4 publications
ICLR 2025
Controllable Context Sensitivity and the Knob Behind It
Julian Minder, Kevin Du, Niklas Stoehr, Giovanni Monea, Chris Wendler, Robert West, Ryan Cotterell
NeurIPS 2025
Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning
Julian Minder, Clément Dumas, Caden Juang, Bilal Chughtai, Neel Nanda
ICLRW 2025
Robustly Identifying Concepts Introduced During Chat Fine-Tuning Using Crosscoders
Julian Minder, Clément Dumas, Bilal Chughtai, Neel Nanda
NeurIPS 2025
The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability?
Denis Sutter, Julian Minder, Thomas Hofmann, Tiago Pimentel