SmoothHess: ReLU Network Feature Interactions via Stein's Lemma

Abstract

Several recent interpretability methods model feature interactions by examining the Hessian of a neural network. This poses a challenge for ReLU networks, which are piecewise-linear and thus have a zero Hessian almost everywhere. We propose SmoothHess, a method for estimating second-order interactions through Stein's Lemma. In particular, we estimate the Hessian of the network convolved with a Gaussian through an efficient sampling algorithm that requires only network gradient calls. SmoothHess is applied post-hoc, requires no modifications to the ReLU network architecture, and allows the extent of smoothing to be controlled explicitly. We provide a non-asymptotic bound on the sample complexity of our estimation procedure. We validate the superior ability of SmoothHess to capture interactions on benchmark datasets and a real-world medical spirometry dataset.
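The sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an isotropic Gaussian with covariance sigma^2 I and uses the standard Stein-type identity that the Hessian of the Gaussian-smoothed function equals E[delta grad f(x + delta)^T] / sigma^2 for delta ~ N(0, sigma^2 I), so only gradient evaluations are needed. The function names (`smooth_hessian_estimate`, `relu_net_grad`), the toy two-layer network, and the final symmetrization step are all assumptions made for illustration.

```python
import numpy as np

def smooth_hessian_estimate(grad_fn, x, sigma, n_samples, rng):
    """Monte Carlo estimate of the Hessian of f convolved with N(0, sigma^2 I).

    grad_fn maps a batch of inputs (n, d) to a batch of gradients (n, d).
    Hypothetical helper; isotropic covariance is an assumption of this sketch.
    """
    d = x.shape[0]
    # Gaussian perturbations delta_i ~ N(0, sigma^2 I)
    deltas = sigma * rng.standard_normal((n_samples, d))
    # Only gradient calls on the original (unsmoothed) function are used
    grads = grad_fn(x + deltas)
    # Sample average of delta_i grad_i^T, scaled by 1 / sigma^2
    h = deltas.T @ grads / (n_samples * sigma**2)
    # Symmetrize, since the finite-sample average is not exactly symmetric
    return 0.5 * (h + h.T)

def relu_net_grad(X, W1, b1, w2):
    """Input gradient of the toy network f(x) = w2 . relu(W1 x + b1), batched."""
    active = (X @ W1.T + b1) > 0          # (n, hidden) ReLU activation pattern
    return (active * w2) @ W1             # (n, d)

# Toy example with randomly drawn weights (illustrative only)
rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))
b1 = rng.standard_normal(4)
w2 = rng.standard_normal(4)
x = np.zeros(3)

H = smooth_hessian_estimate(
    lambda X: relu_net_grad(X, W1, b1, w2), x, sigma=0.5, n_samples=50_000, rng=rng
)
print(H)  # 3 x 3 symmetric matrix of estimated pairwise interactions
```

Here `sigma` plays the role of the explicit smoothing control mentioned in the abstract: larger values average the gradient over a wider neighbourhood of `x`, while very small values approach the exact Hessian, which is zero almost everywhere for a ReLU network.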

Cite

Text

Torop et al. "SmoothHess: ReLU Network Feature Interactions via Stein's Lemma." Neural Information Processing Systems, 2023.

Markdown

[Torop et al. "SmoothHess: ReLU Network Feature Interactions via Stein's Lemma." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/torop2023neurips-smoothhess/)

BibTeX

@inproceedings{torop2023neurips-smoothhess,
  title     = {{SmoothHess: ReLU Network Feature Interactions via Stein's Lemma}},
  author    = {Torop, Max and Masoomi, Aria and Hill, Davin and Kose, Kivanc and Ioannidis, Stratis and Dy, Jennifer},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/torop2023neurips-smoothhess/}
}