Icard, Thomas

13 publications

JMLR 2025 Causal Abstraction: A Theoretical Foundation for Mechanistic Interpretability Atticus Geiger, Duligur Ibeling, Amir Zur, Maheep Chaudhary, Sonakshi Chauhan, Jing Huang, Aryaman Arora, Zhengxuan Wu, Noah Goodman, Christopher Potts, Thomas Icard

ICML 2025 Internal Causal Mechanisms Robustly Predict Language Model Out-of-Distribution Behaviors Jing Huang, Junyi Tao, Thomas Icard, Diyi Yang, Christopher Potts

CLeaR 2024 Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations Atticus Geiger, Zhengxuan Wu, Christopher Potts, Thomas Icard, Noah Goodman

CLeaR 2023 Causal Abstraction with Soft Interventions Riccardo Massidda, Atticus Geiger, Thomas Icard, Davide Bacciu

NeurIPS 2023 Comparing Causal Frameworks: Potential Outcomes, Structural Models, Graphs, and Abstractions Duligur Ibeling, Thomas Icard

TMLR 2023 Holistic Evaluation of Language Models Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel Orr, Lucia Zheng, Mert Yuksekgonul, Mirac Suzgun, Nathan Kim, Neel Guha, Niladri S. Chatterji, Omar Khattab, Peter Henderson, Qian Huang, Ryan Andrew Chi, Sang Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori Hashimoto, Thomas Icard, Tianyi Zhang, Vishrav Chaudhary, William Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, Yuta Koreeda

NeurIPS 2023 Interpretability at Scale: Identifying Causal Mechanisms in Alpaca Zhengxuan Wu, Atticus Geiger, Thomas Icard, Christopher Potts, Noah Goodman

ICML 2022 Inducing Causal Structure for Interpretable Neural Networks Atticus Geiger, Zhengxuan Wu, Hanson Lu, Josh Rozner, Elisa Kreiss, Thomas Icard, Noah Goodman, Christopher Potts

NeurIPS 2021 A Topological Perspective on Causal Inference Duligur Ibeling, Thomas Icard

NeurIPS 2021 Causal Abstractions of Neural Networks Atticus Geiger, Hanson Lu, Thomas Icard, Christopher Potts

AAAI 2020 Probabilistic Reasoning Across the Causal Hierarchy Duligur Ibeling, Thomas Icard

UAI 2019 On Open-Universe Causal Reasoning Duligur Ibeling, Thomas Icard

IJCAI 2018 On the Conditional Logic of Simulation Models Duligur Ibeling, Thomas Icard