Hedström, Anna

13 publications

NeurIPS 2025 Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework Laura Kopf, Nils Feldhus, Kirill Bykov, Philine Lou Bommer, Anna Hedström, Marina MC Höhne, Oliver Eberle
AAAI 2025 Evaluate with the Inverse: Efficient Approximation of Latent Explanation Quality Distribution Carlos Eiras-Franco, Anna Hedström, Marina M.-C. Höhne
TMLR 2025 Evaluating Interpretable Methods via Geometric Alignment of Functional Distortions Anna Hedström, Philine Lou Bommer, Thomas F Burns, Sebastian Lapuschkin, Wojciech Samek, Marina MC Höhne
ICML 2025 To Steer or Not to Steer? Mechanistic Error Reduction with Abstention for Language Models Anna Hedström, Salim I. Amoukou, Tom Bewley, Saumitra Mishra, Manuela Veloso
NeurIPS 2024 CoSy: Evaluating Textual Explanations of Neurons Laura Kopf, Philine Lou Bommer, Anna Hedström, Sebastian Lapuschkin, Marina M.-C. Höhne, Kirill Bykov
ICMLW 2024 CoSy: Evaluating Textual Explanations of Neurons Laura Kopf, Philine Lou Bommer, Anna Hedström, Sebastian Lapuschkin, Marina MC Höhne, Kirill Bykov
ICMLW 2024 CoSy: Evaluating Textual Explanations of Neurons Laura Kopf, Philine Lou Bommer, Anna Hedström, Sebastian Lapuschkin, Marina MC Höhne, Kirill Bykov
ECCVW 2024 From Flexibility to Manipulation: The Slippery Slope of XAI Evaluation Kristoffer Wickstrøm, Marina M.-C. Höhne, Anna Hedström
NeurIPSW 2024 The Price of Freedom: An Adversarial Attack on Interpretability Evaluation Kristoffer Knutsen Wickstrøm, Marina MC Höhne, Anna Hedström
MLOSS 2023 Quantus: An Explainable AI Toolkit for Responsible Evaluation of Neural Network Explanations and Beyond Anna Hedström, Leander Weber, Daniel Krakowczyk, Dilyara Bareeva, Franz Motzkus, Wojciech Samek, Sebastian Lapuschkin, Marina M.-C. Höhne
NeurIPSW 2023 Sanity Checks Revisited: An Exploration to Repair the Model Parameter Randomisation Test Anna Hedström, Leander Weber, Sebastian Lapuschkin, Marina MC Höhne
TMLR 2023 The Meta-Evaluation Problem in Explainable AI: Identifying Reliable Estimators with MetaQuantus Anna Hedström, Philine Lou Bommer, Kristoffer Knutsen Wickstrøm, Wojciech Samek, Sebastian Lapuschkin, Marina MC Höhne
AAAI 2022 NoiseGrad - Enhancing Explanations by Introducing Stochasticity to Model Weights Kirill Bykov, Anna Hedström, Shinichi Nakajima, Marina M.-C. Höhne