An Additive Instance-Wise Approach to Multi-Class Model Interpretation

Vy Vo, Van Nguyen, Trung Le, Quan Hung Tran, Gholamreza Haffari, Seyit Camtepe, Dinh Phung

ICLR 2023


Abstract

Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system. Many interpretation methods focus on identifying explanatory input features, and these methods generally fall into two main categories: attribution and selection. A popular attribution-based approach is to exploit local neighborhoods for learning instance-specific explainers in an additive manner. Because a separate explainer is learned for each instance, the process is inefficient and susceptible to poorly conditioned samples. Meanwhile, many selection-based methods directly optimize local feature distributions in an instance-wise training framework, and are thereby capable of leveraging global information from other inputs. However, they can only interpret single-class predictions, and many suffer from inconsistency across different settings due to a strict reliance on a predefined number of selected features. This work exploits the strengths of both approaches and proposes a framework for learning local explanations simultaneously for multiple target classes. Our model explainer significantly outperforms additive and instance-wise counterparts on faithfulness, with more compact and comprehensible explanations. We also demonstrate the capacity to select stable and important features through extensive experiments on various data sets and black-box model architectures.
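To make the additive, neighborhood-based attribution idea in the abstract concrete, here is a minimal sketch of a generic local linear surrogate in the spirit of that family of methods. It is not the explainer proposed in the paper. The function `local_additive_explanation`, the toy `black_box` model, and all parameter values are assumptions made for illustration only. The sketch also shows where the cost comes from: each explained instance needs its own batch of model queries and its own surrogate fit.

```python
import numpy as np

def local_additive_explanation(predict, x, n_samples=500, sigma=0.5, seed=0):
    """Fit a proximity-weighted linear surrogate around x (hypothetical helper).

    Returns one additive weight per input feature.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    # Sample a local neighborhood by perturbing x with Gaussian noise.
    Z = x + sigma * rng.standard_normal((n_samples, d))
    # Query the black box on every neighbor (the expensive per-instance step).
    y = predict(Z)
    # Weight neighbors by closeness to x with an RBF kernel (an assumed choice).
    dist2 = np.sum((Z - x) ** 2, axis=1)
    w = np.exp(-dist2 / (2.0 * sigma ** 2 * d))
    # Weighted least squares: intercept column plus offsets from x.
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[1:]  # drop the intercept; keep per-feature attributions

def black_box(Z):
    # Toy stand-in for a black-box model: linear, third feature is irrelevant.
    return 3.0 * Z[:, 0] - 2.0 * Z[:, 1] + 0.0 * Z[:, 2]

x = np.array([1.0, 2.0, 3.0])
attributions = local_additive_explanation(black_box, x)
print(np.round(attributions, 3))
```

Because the toy model is exactly linear, the surrogate recovers its coefficients; for a real nonlinear model the weights would only approximate local behavior and could vary with the sampled neighborhood.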


Cite

Text

Vo et al. "An Additive Instance-Wise Approach to Multi-Class Model Interpretation." International Conference on Learning Representations, 2023.

Markdown

[Vo et al. "An Additive Instance-Wise Approach to Multi-Class Model Interpretation." International Conference on Learning Representations, 2023.](https://mlanthology.org/iclr/2023/vo2023iclr-additive/)

BibTeX

@inproceedings{vo2023iclr-additive,
  title     = {{An Additive Instance-Wise Approach to Multi-Class Model Interpretation}},
  author    = {Vo, Vy and Nguyen, Van and Le, Trung and Tran, Quan Hung and Haffari, Gholamreza and Camtepe, Seyit and Phung, Dinh},
  booktitle = {International Conference on Learning Representations},
  year      = {2023},
  url       = {https://mlanthology.org/iclr/2023/vo2023iclr-additive/}
}