Scalable Multi-Stage Influence Function for Large Language Models via Eigenvalue-Corrected Kronecker-Factored Parameterization
Abstract
Pre-trained large language models (LLMs) are commonly fine-tuned to adapt to downstream tasks. Since the majority of knowledge is acquired during pre-training, attributing the predictions of fine-tuned LLMs to their pre-training data may provide valuable insights. Influence functions have been proposed as a means to explain model predictions based on training data. However, existing approaches often fail to compute "multi-stage" influence and lack scalability to billion-scale LLMs. In this paper, we propose multi-stage influence functions to attribute the downstream predictions of fine-tuned LLMs to pre-training data under the full-parameter fine-tuning paradigm. To enhance the efficiency and practicality of our multi-stage influence function, we leverage Eigenvalue-corrected Kronecker-Factored (EK-FAC) parameterization for efficient approximation. Empirical results validate the superior scalability of EK-FAC approximation and the effectiveness of our multi-stage influence function. Additionally, case studies on a real-world LLM, dolly-v2-3b, demonstrate its interpretive power, with exemplars illustrating insights provided by multi-stage influence estimates.
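As a rough illustration of the EK-FAC approximation the abstract refers to, the sketch below builds the factors for a single linear layer and uses them to compute a damped inverse-curvature-vector product. It follows the generic EK-FAC recipe from the wider literature and is not the authors' implementation, and it does not cover the multi-stage extension. All function names, argument names, and the damping value are hypothetical.

```python
import numpy as np

def ekfac_factors(acts, grads):
    """Build EK-FAC factors for one linear layer (illustrative names).

    acts:  (n, d_in)  layer inputs per example
    grads: (n, d_out) loss gradients w.r.t. the layer outputs per example
    """
    n = acts.shape[0]
    A = acts.T @ acts / n      # input second-moment factor
    S = grads.T @ grads / n    # output-gradient second-moment factor
    _, Q_A = np.linalg.eigh(A)
    _, Q_S = np.linalg.eigh(S)
    # The per-example weight gradient is outer(grads[i], acts[i]); in the
    # Kronecker eigenbasis it becomes outer(Q_S.T @ s_i, Q_A.T @ a_i).
    proj_a = acts @ Q_A
    proj_s = grads @ Q_S
    # Corrected eigenvalues: mean squared projected gradient per entry.
    lam = (proj_s ** 2).T @ (proj_a ** 2) / n   # shape (d_out, d_in)
    return Q_A, Q_S, lam

def ekfac_ihvp(V, Q_A, Q_S, lam, damping=1e-3):
    """Approximate (G + damping * I)^{-1} applied to V, with V shaped (d_out, d_in)."""
    V_proj = Q_S.T @ V @ Q_A
    return Q_S @ (V_proj / (lam + damping)) @ Q_A.T

def influence_score(query_grad, train_grad, Q_A, Q_S, lam, damping=1e-3):
    """Inner product of a query gradient with the preconditioned training gradient.

    Sign conventions differ between papers; this returns the raw inner product.
    """
    return float(np.sum(query_grad * ekfac_ihvp(train_grad, Q_A, Q_S, lam, damping)))

# Toy usage on random data (assumed shapes only, not real model gradients).
rng = np.random.default_rng(0)
acts = rng.normal(size=(64, 4))
grads = rng.normal(size=(64, 3))
Q_A, Q_S, lam = ekfac_factors(acts, grads)
score = influence_score(np.outer(grads[0], acts[0]),
                        np.outer(grads[1], acts[1]),
                        Q_A, Q_S, lam)
```

The scalability argument is that only the two small factor eigendecompositions and an elementwise division are needed, rather than a solve against the full (d_in * d_out)-dimensional curvature matrix.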
Cite
Text
Bao et al. "Scalable Multi-Stage Influence Function for Large Language Models via Eigenvalue-Corrected Kronecker-Factored Parameterization." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/892
Markdown
[Bao et al. "Scalable Multi-Stage Influence Function for Large Language Models via Eigenvalue-Corrected Kronecker-Factored Parameterization." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/bao2025ijcai-scalable/) doi:10.24963/IJCAI.2025/892
BibTeX
@inproceedings{bao2025ijcai-scalable,
title = {{Scalable Multi-Stage Influence Function for Large Language Models via Eigenvalue-Corrected Kronecker-Factored Parameterization}},
author = {Bao, Yuntai and Zhang, Xuhong and Du, Tianyu and Zhao, Xinkui and Zong, Jiang and Peng, Hao and Yin, Jianwei},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2025},
pages = {8022-8030},
doi = {10.24963/IJCAI.2025/892},
url = {https://mlanthology.org/ijcai/2025/bao2025ijcai-scalable/}
}