Ravfogel, Shauli

9 publications

NeurIPS 2025 Emergence of Linear Truth Encodings in Language Models Shauli Ravfogel, Gilad Yehudai, Tal Linzen, Joan Bruna, Alberto Bietti
ICLR 2025 Gumbel Counterfactual Generation from Language Models Shauli Ravfogel, Anej Svete, Vésteinn Snæbjarnarson, Ryan Cotterell
ICLRW 2025 Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces Yihuai Hong, Lei Yu, Haiqin Yang, Shauli Ravfogel, Mor Geva
NeurIPS 2025 Preserving Task-Relevant Information Under Linear Concept Removal Floris Holstege, Shauli Ravfogel, Bram Wouters
NeurIPS 2024 On Affine Homotopy Between Language Encoders Robin S. M. Chan, Reda Boumasmoud, Anej Svete, Yuxin Ren, Qipeng Guo, Zhijing Jin, Shauli Ravfogel, Mrinmaya Sachan, Bernhard Schölkopf, Mennatallah El-Assady, Ryan Cotterell
ICML 2024 Representation Surgery: Theory and Practice of Affine Steering Shashwat Singh, Shauli Ravfogel, Jonathan Herzig, Roee Aharoni, Ryan Cotterell, Ponnurangam Kumaraguru
NeurIPS 2023 LEACE: Perfect Linear Concept Erasure in Closed Form Nora Belrose, David Schneider-Joseph, Shauli Ravfogel, Ryan Cotterell, Edward Raff, Stella Biderman
NeurIPS 2023 Linguistic Binding in Diffusion Models: Enhancing Attribute Correspondence Through Attention Map Alignment Royi Rassin, Eran Hirsch, Daniel Glickman, Shauli Ravfogel, Yoav Goldberg, Gal Chechik
ICML 2022 Linear Adversarial Concept Erasure Shauli Ravfogel, Michael Twiton, Yoav Goldberg, Ryan D Cotterell