LibraGrad: Balancing Gradient Flow for Universally Better Vision Transformer Attributions
Abstract
Why do gradient-based explanations struggle with Transformers, and how can we improve them? We identify gradient flow imbalances in Transformers that violate FullGrad-completeness, a critical property for attribution faithfulness that CNNs naturally possess. To address this issue, we introduce LibraGrad--a theoretically grounded post-hoc approach that corrects gradient imbalances through pruning and scaling of backward paths, without changing the forward pass or adding computational overhead. We evaluate LibraGrad using three metric families: Faithfulness, which quantifies prediction changes under perturbations of the most and least relevant features; Completeness Error, which measures attribution conservation relative to model outputs; and Segmentation AP, which assesses alignment with human perception. Extensive experiments across 8 architectures, 4 model sizes, and 5 datasets show that LibraGrad universally enhances gradient-based methods, outperforming existing white-box methods--including Transformer-specific approaches--across all metrics. We demonstrate superior qualitative results through two complementary evaluations: precise text-prompted region highlighting on CLIP models and accurate class discrimination between co-occurring animals on ImageNet-finetuned models--two settings in which existing methods often struggle. LibraGrad is effective even on the attention-free MLP-Mixer architecture, indicating potential for extension to other modern architectures. Our code is freely available at https://nightmachinery.github.io/LibraGrad/.
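The abstract describes a post-hoc mechanism that rescales gradients along backward paths while leaving the forward pass untouched. The sketch below is a minimal illustration of that general mechanism in PyTorch, not the actual LibraGrad algorithm: it rescales gradients flowing through one module via a backward hook and then computes a standard gradient-times-input attribution. The `scale` factor and the choice of module are hypothetical placeholders; LibraGrad's specific pruning and scaling rules are defined in the paper itself.

```python
import torch
import torch.nn as nn

# Minimal sketch (not the LibraGrad algorithm): a full-backward hook rescales
# the gradients a module propagates to its inputs, without changing what the
# module computes in the forward pass. `scale` is a hypothetical placeholder.
def make_rescaling_hook(scale: float):
    def hook(module, grad_input, grad_output):
        # Return rescaled input gradients; pass None entries through unchanged.
        return tuple(g * scale if g is not None else None for g in grad_input)
    return hook

model = nn.Sequential(nn.Linear(8, 8), nn.GELU(), nn.Linear(8, 1))
model[1].register_full_backward_hook(make_rescaling_hook(0.5))

x = torch.randn(1, 8, requires_grad=True)
model(x).sum().backward()  # forward outputs are identical with or without the hook

# A simple gradient-based attribution (gradient times input) computed from
# the modified backward pass.
attribution = (x.grad * x).detach()
print(attribution)
```

Because the hook only intervenes during backpropagation, predictions are unaffected and no extra forward computation is added, which matches the "post-hoc, forward-pass-preserving" property the abstract emphasizes.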
Cite
Text
Mehri et al. "LibraGrad: Balancing Gradient Flow for Universally Better Vision Transformer Attributions." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.00016
Markdown
[Mehri et al. "LibraGrad: Balancing Gradient Flow for Universally Better Vision Transformer Attributions." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/mehri2025cvpr-libragrad/) doi:10.1109/CVPR52734.2025.00016
BibTeX
@inproceedings{mehri2025cvpr-libragrad,
title = {{LibraGrad: Balancing Gradient Flow for Universally Better Vision Transformer Attributions}},
author = {Mehri, Faridoun and Baghshah, Mahdieh Soleymani and Pilehvar, Mohammad Taher},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2025},
  pages = {67--78},
doi = {10.1109/CVPR52734.2025.00016},
url = {https://mlanthology.org/cvpr/2025/mehri2025cvpr-libragrad/}
}