Janiak, Jett

3 publications

ICLRW 2025. Chain-of-Thought Reasoning in the Wild Is Not Always Faithful. Iván Arcuschin, Jett Janiak, Robert Krzyzanowski, Senthooran Rajamanoharan, Neel Nanda, Arthur Conmy
ICMLW 2024. An Adversarial Example for Direct Logit Attribution: Memory Management in GELU-4L. Jett Janiak, Can Rager, James Dao, Yeu-Tong Lau
NeurIPSW 2024. Characterizing Stable Regions in the Residual Stream of LLMs. Jett Janiak, Jacek Karwowski, Chatrik Singh Mangat, Giorgi Giglemiani, Nora Petrova, Stefan Heimersheim