Understanding Inter-Concept Relationships in Concept-Based Models

Abstract

Concept-based explainability methods provide insight into deep learning systems by constructing explanations using human-understandable concepts. While the literature on human reasoning demonstrates that we exploit relationships between concepts when solving tasks, it is unclear whether concept-based methods incorporate the rich structure of inter-concept relationships. We analyse the concept representations learnt by concept-based models to understand whether these models correctly capture inter-concept relationships. First, we empirically demonstrate that state-of-the-art concept-based models produce representations that lack stability and robustness and fail to capture inter-concept relationships. Then, we develop a novel algorithm which leverages inter-concept relationships to improve concept intervention accuracy, demonstrating how correctly capturing inter-concept relationships can improve performance on downstream tasks.
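To make the kind of analysis described in the abstract concrete, the sketch below illustrates one simple way to probe whether a learned concept representation captures its relationship to another concept: train a linear probe to predict concept j's label from concept i's per-sample representation and compare the resulting AUC to chance. This is only an illustrative probe under assumed synthetic data, not the paper's algorithm; all names (emb_dim, c_i, c_j, etc.) and the data-generating process are hypothetical.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Illustrative setup (assumption, not from the paper): two correlated binary
# concepts, e.g. "has wings" and "can fly".
rng = np.random.default_rng(0)
n_samples, emb_dim = 2000, 16
c_i = rng.integers(0, 2, size=n_samples)
c_j = np.where(rng.random(n_samples) < 0.85, c_i, 1 - c_i)  # correlated with c_i

# Stand-in for the per-sample representation of concept i that a
# concept-based model might produce: signal about c_i plus noise.
emb_i = (c_i[:, None] * rng.normal(1.0, 0.3, size=(n_samples, emb_dim))
         + rng.normal(0.0, 0.5, size=(n_samples, emb_dim)))

# If concept i's representation encodes its relationship to concept j,
# a simple linear probe should predict c_j from emb_i well above chance (AUC ~ 0.5).
X_tr, X_te, y_tr, y_te = train_test_split(emb_i, c_j, test_size=0.3, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, probe.predict_proba(X_te)[:, 1])
print(f"Probe AUC for predicting concept j from concept i's representation: {auc:.3f}")

A representation that scores near 0.5 under such a probe would suggest the model treats related concepts as independent, which is the kind of failure mode the abstract describes for state-of-the-art concept-based models.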

Cite

Text

Raman et al. "Understanding Inter-Concept Relationships in Concept-Based Models." International Conference on Machine Learning, 2024.

Markdown

[Raman et al. "Understanding Inter-Concept Relationships in Concept-Based Models." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/raman2024icml-understanding/)

BibTeX

@inproceedings{raman2024icml-understanding,
  title     = {{Understanding Inter-Concept Relationships in Concept-Based Models}},
  author    = {Raman, Naveen Janaki and Espinosa Zarlenga, Mateo and Jamnik, Mateja},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {42009--42025},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/raman2024icml-understanding/}
}