Better Training Data Attribution via Better Inverse Hessian-Vector Products

Abstract

Training data attribution (TDA) provides insights into which training data are responsible for a learned model behavior. Gradient-based TDA methods such as influence functions and unrolled differentiation both involve a computation that resembles an inverse Hessian-vector product (iHVP), which is difficult to approximate efficiently. We introduce ASTRA, an algorithm that applies an EKFAC preconditioner to Neumann series iterations to arrive at an accurate iHVP approximation for TDA. ASTRA is easy to tune, requires fewer iterations than plain Neumann series iteration, and is more accurate than EKFAC-based approximations. Using ASTRA, we show that improving the accuracy of the iHVP approximation can significantly improve TDA performance.
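
To make the idea concrete, below is a minimal NumPy sketch of preconditioned Neumann (Richardson) iteration for a damped iHVP, (H + λI)^{-1} v. It is an illustration under stated assumptions, not the paper's implementation: the function name is hypothetical, a simple Jacobi (diagonal) preconditioner stands in for EKFAC, and the damping factor alpha is an illustrative stabilization choice.

import numpy as np

def preconditioned_neumann_ihvp(hvp, precond, v, lam=1e-2, alpha=0.5, num_iters=100):
    """Approximate x = (H + lam*I)^{-1} v via preconditioned
    Neumann/Richardson iteration: x <- x + alpha * P(v - (H + lam*I) x).

    hvp     -- callable computing H @ x (in practice, via autodiff)
    precond -- callable applying an approximate inverse of (H + lam*I);
               ASTRA uses EKFAC here, but this sketch accepts any callable
    alpha   -- damping factor for stability (illustrative assumption)
    """
    x = np.zeros_like(v)
    for _ in range(num_iters):
        residual = v - (hvp(x) + lam * x)
        x = x + alpha * precond(residual)
    return x

# Toy problem: a random PSD stand-in for the Hessian, with a Jacobi
# (diagonal) preconditioner in place of EKFAC purely for illustration.
rng = np.random.default_rng(0)
A = rng.normal(size=(500, 100))
H = A.T @ A / 500.0
lam = 1e-2
v = rng.normal(size=100)

diag = np.diag(H) + lam
x = preconditioned_neumann_ihvp(
    hvp=lambda u: H @ u,
    precond=lambda r: r / diag,
    v=v,
    lam=lam,
)
# Residual norm should be near zero once the iteration has converged.
print(np.linalg.norm((H + lam * np.eye(100)) @ x - v))

Even this crude diagonal preconditioner illustrates the design choice: a good preconditioner shrinks the spectral radius of the iteration, so far fewer Neumann steps are needed than in the unpreconditioned case.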

Cite

Text

Wang et al. "Better Training Data Attribution via Better Inverse Hessian-Vector Products." Advances in Neural Information Processing Systems, 2025.

Markdown

[Wang et al. "Better Training Data Attribution via Better Inverse Hessian-Vector Products." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/wang2025neurips-better/)

BibTeX

@inproceedings{wang2025neurips-better,
  title     = {{Better Training Data Attribution via Better Inverse Hessian-Vector Products}},
  author    = {Wang, Andrew and Nguyen, Elisa and Yang, Runshi and Bae, Juhan and McIlraith, Sheila A. and Grosse, Roger Baker},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/wang2025neurips-better/}
}