Learning How to Explain Neural Networks: PatternNet and PatternAttribution

Pieter-Jan Kindermans, Kristof T. Schütt, Maximilian Alber, Klaus-Robert Müller, Dumitru Erhan, Been Kim, Sven Dähne

ICLR 2018

Abstract

DeConvNet, Guided BackProp, and LRP were invented to better understand deep neural networks. We show that these methods do not produce the theoretically correct explanation for a linear model, yet they are used on multi-layer networks with millions of parameters. This is a cause for concern, since linear models are simple neural networks. We argue that explanation methods for neural networks should work reliably in the limit of simplicity, that is, for linear models. Based on our analysis of linear models, we propose a generalization that yields two explanation techniques (PatternNet and PatternAttribution) that are theoretically sound for linear models and produce improved explanations for deep networks.
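The linear-model concern in the abstract can be illustrated with a small numerical sketch. Everything concrete below is an assumption made for illustration and is not taken from this page: the two-dimensional toy data, the directions `a_s` and `a_d`, the hand-picked weight vector `w`, and the covariance-based estimate `a` of the signal direction. The sketch shows only that the weights of a perfectly accurate linear model need not point along the direction in which the target actually varies in the input, which is why reading the weights (or the gradient) as the explanation can mislead.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical toy setup: each input is a signal part plus a distractor part.
a_s = np.array([1.0, 0.0])  # direction that carries the target (assumed)
a_d = np.array([1.0, 1.0])  # direction that carries only the distractor (assumed)

y = rng.normal(size=n)      # target values
eps = rng.normal(size=n)    # distractor amplitude, independent of y
X = np.outer(y, a_s) + np.outer(eps, a_d)

# A weight vector that recovers y exactly must satisfy w.a_s = 1 and w.a_d = 0,
# so it is determined by the need to cancel the distractor.
w = np.array([1.0, -1.0])
y_hat = X @ w

# Covariance-based estimate of the signal direction: cov(x, y_hat) / var(y_hat).
X_centered = X - X.mean(axis=0)
y_centered = y_hat - y_hat.mean()
a = (X_centered * y_centered[:, None]).mean(axis=0) / y_hat.var()


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


print("cosine(w, a_s):", cosine(w, a_s))
print("cosine(a, a_s):", cosine(a, a_s))
```

In this sketch `w` is tilted away from `a_s` because it has to cancel the distractor, while the covariance-based estimate `a` lies close to `a_s`. Changing `a_d` changes `w` but leaves the signal direction untouched, which is the intuition behind asking explanation methods to be correct for linear models first.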

Cite

Text

Kindermans et al. "Learning How to Explain Neural Networks: PatternNet and PatternAttribution." International Conference on Learning Representations, 2018.

Markdown

[Kindermans et al. "Learning How to Explain Neural Networks: PatternNet and PatternAttribution." International Conference on Learning Representations, 2018.](https://mlanthology.org/iclr/2018/kindermans2018iclr-learning/)

BibTeX

@inproceedings{kindermans2018iclr-learning,
  title     = {{Learning How to Explain Neural Networks: PatternNet and PatternAttribution}},
  author    = {Kindermans, Pieter-Jan and Schütt, Kristof T. and Alber, Maximilian and Müller, Klaus-Robert and Erhan, Dumitru and Kim, Been and Dähne, Sven},
  booktitle = {International Conference on Learning Representations},
  year      = {2018},
  url       = {https://mlanthology.org/iclr/2018/kindermans2018iclr-learning/}
}