Dunefsky, Jacob

3 publications

ICML 2024. Observable Propagation: Uncovering Feature Vectors in Transformers. Jacob Dunefsky, Arman Cohan.
NeurIPS 2024. Transcoders Find Interpretable LLM Feature Circuits. Jacob Dunefsky, Philippe Chlenski, Neel Nanda.
ICMLW 2024. Transcoders Find Interpretable LLM Feature Circuits. Jacob Dunefsky, Philippe Chlenski, Neel Nanda.