GlanceNets: Interpretable, Leak-Proof Concept-Based Models

Abstract

There is growing interest in concept-based models (CBMs) that combine high performance and interpretability by acquiring and reasoning with a vocabulary of high-level concepts. A key requirement is that the concepts be interpretable. Existing CBMs tackle this desideratum using a variety of heuristics based on unclear notions of interpretability, and fail to acquire concepts with the intended semantics. We address this by providing a clear definition of interpretability in terms of alignment between the model’s representation and an underlying data generation process, and introduce GlanceNets, a new CBM that exploits techniques from causal disentangled representation learning and open-set recognition to achieve alignment, thus improving the interpretability of the learned concepts. We show that GlanceNets, paired with concept-level supervision, achieve better alignment than state-of-the-art approaches while preventing spurious information from unintentionally leaking into the learned concepts.
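
For readers unfamiliar with the architecture family, the sketch below illustrates the generic concept-bottleneck idea the abstract refers to: inputs are mapped to a layer of concept activations, the task prediction is computed from those concepts alone, and concept-level supervision penalizes mismatches with annotated concepts. This is a minimal PyTorch illustration of a plain CBM under assumed layer sizes and names, not an implementation of GlanceNets itself (which additionally uses disentangled representation learning and open-set recognition).

import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptBottleneckModel(nn.Module):
    """Generic concept-based model: predictions are made from a vocabulary
    of high-level concept activations rather than raw features."""

    def __init__(self, input_dim: int, n_concepts: int, n_classes: int):
        super().__init__()
        # Encoder from raw inputs to concept logits (backbone is illustrative).
        self.concept_encoder = nn.Sequential(
            nn.Linear(input_dim, 128),
            nn.ReLU(),
            nn.Linear(128, n_concepts),
        )
        # The task head reasons only over the concept layer, so each
        # concept's contribution to the prediction can be inspected.
        self.task_head = nn.Linear(n_concepts, n_classes)

    def forward(self, x: torch.Tensor):
        concept_logits = self.concept_encoder(x)
        concepts = torch.sigmoid(concept_logits)  # activations in [0, 1]
        y_logits = self.task_head(concepts)
        return concepts, y_logits

# Hypothetical training step with concept-level supervision: the concept
# loss anchors each unit to its annotated concept, discouraging leakage of
# unrelated (spurious) information into the concept layer.
model = ConceptBottleneckModel(input_dim=64, n_concepts=8, n_classes=4)
x = torch.randn(16, 64)
concept_labels = torch.randint(0, 2, (16, 8)).float()
y_labels = torch.randint(0, 4, (16,))

concepts, y_logits = model(x)
loss = (F.cross_entropy(y_logits, y_labels)
        + F.binary_cross_entropy(concepts, concept_labels))
loss.backward()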

Cite

Text

Marconato et al. "GlanceNets: Interpretable, Leak-Proof Concept-Based Models." NeurIPS 2022 Workshops: nCSI, 2022.

Markdown

[Marconato et al. "GlanceNets: Interpretable, Leak-Proof Concept-Based Models." NeurIPS 2022 Workshops: nCSI, 2022.](https://mlanthology.org/neuripsw/2022/marconato2022neuripsw-glancenets/)

BibTeX

@inproceedings{marconato2022neuripsw-glancenets,
  title     = {{GlanceNets: Interpretable, Leak-Proof Concept-Based Models}},
  author    = {Marconato, Emanuele and Passerini, Andrea and Teso, Stefano},
  booktitle = {NeurIPS 2022 Workshops: nCSI},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/marconato2022neuripsw-glancenets/}
}