Dixon, Lucas

8 publications

ICLR 2025. Scalable Influence and Fact Tracing for Large Language Model Pretraining. Tyler A. Chang, Dheeraj Rajagopal, Tolga Bolukbasi, Lucas Dixon, Ian Tenney
ICML 2024. Decoding-Time Realignment of Language Models. Tianlin Liu, Shangmin Guo, Leonardo Bianco, Daniele Calandriello, Quentin Berthet, Felipe Llinares-López, Jessica Hoffmann, Lucas Dixon, Michal Valko, Mathieu Blondel
ICML 2024. Interpretability Illusions in the Generalization of Simplified Models. Dan Friedman, Andrew Kyle Lampinen, Lucas Dixon, Danqi Chen, Asma Ghandeharioun
ICML 2024. Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models. Asma Ghandeharioun, Avi Caciularu, Adam Pearce, Lucas Dixon, Mor Geva
NeurIPS 2024. Who's Asking? User Personas and the Mechanics of Latent Misalignment. Asma Ghandeharioun, Ann Yuan, Marius Guerard, Emily Reif, Michael A. Lepori, Lucas Dixon
NeurIPSW 2023. Comparing Representational and Functional Similarity in Small Transformer Language Models. Dan Friedman, Andrew Kyle Lampinen, Lucas Dixon, Danqi Chen, Asma Ghandeharioun
NeurIPS 2022. Beyond Rewards: A Hierarchical Perspective on Offline Multiagent Behavioral Analysis. Shayegan Omidshafiei, Andrei Kapishnikov, Yannick Assogba, Lucas Dixon, Been Kim
ICML 2022. GLaM: Efficient Scaling of Language Models with Mixture-of-Experts. Nan Du, Yanping Huang, Andrew M Dai, Simon Tong, Dmitry Lepikhin, Yuanzhong Xu, Maxim Krikun, Yanqi Zhou, Adams Wei Yu, Orhan Firat, Barret Zoph, Liam Fedus, Maarten P Bosma, Zongwei Zhou, Tao Wang, Emma Wang, Kellie Webster, Marie Pellat, Kevin Robinson, Kathleen Meier-Hellstern, Toju Duke, Lucas Dixon, Kun Zhang, Quoc Le, Yonghui Wu, Zhifeng Chen, Claire Cui