Selective Concept Bottleneck Models Without Predefined Concepts
Abstract
Concept-based models like Concept Bottleneck Models (CBMs) have garnered significant interest for improving model interpretability by first predicting human-understandable concepts before mapping them to the output classes. Early approaches required costly concept annotations. To alleviate this, recent methods used large language models to automatically generate class-specific concept descriptions and learned mappings from a pretrained black-box model's raw features to these concepts using vision-language models. However, these approaches assume prior knowledge of which concepts the black-box model has learned. In this work, we instead discover the concepts encoded by the model through unsupervised concept discovery techniques. We further leverage a simple input-dependent concept selection mechanism that dynamically retains a sparse set of relevant concepts for each input, enhancing both sparsity and interpretability. Our approach not only improves downstream performance, but also requires significantly fewer concepts for accurate classification. Lastly, we show how large vision-language models can guide the editing of our models' weights to correct model errors.
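To make the input-dependent sparse selection idea concrete, below is a minimal illustrative sketch assuming a simple top-k gating over per-input concept activations, followed by a linear concept-to-class layer. The class name `SparseConceptSelector`, the choice of top-k gating, and all hyperparameters are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn


class SparseConceptSelector(nn.Module):
    """Illustrative sketch: keep only the k most activated concepts per input
    before a linear concept-to-class layer. Generic top-k gating, not
    necessarily the selection mechanism used in the paper."""

    def __init__(self, num_concepts: int, num_classes: int, k: int = 8):
        super().__init__()
        self.k = k
        self.classifier = nn.Linear(num_concepts, num_classes)

    def forward(self, concept_scores: torch.Tensor) -> torch.Tensor:
        # concept_scores: (batch, num_concepts), e.g. similarities between
        # discovered concept directions and the backbone's features.
        topk = torch.topk(concept_scores.abs(), self.k, dim=-1)
        mask = torch.zeros_like(concept_scores)
        mask.scatter_(-1, topk.indices, 1.0)
        # Zero out all but the k most relevant concepts for this input,
        # then classify from the resulting sparse concept vector.
        return self.classifier(concept_scores * mask)


# Usage: concept activations for a batch of 4 inputs over 128 discovered concepts.
scores = torch.randn(4, 128)
logits = SparseConceptSelector(num_concepts=128, num_classes=10)(scores)
```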
Cite
Text
Schrodi et al. "Selective Concept Bottleneck Models Without Predefined Concepts." Transactions on Machine Learning Research, 2025.
Markdown
[Schrodi et al. "Selective Concept Bottleneck Models Without Predefined Concepts." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/schrodi2025tmlr-selective/)
BibTeX
@article{schrodi2025tmlr-selective,
  title   = {{Selective Concept Bottleneck Models Without Predefined Concepts}},
  author  = {Schrodi, Simon and Schur, Julian and Argus, Max and Brox, Thomas},
  journal = {Transactions on Machine Learning Research},
  year    = {2025},
  url     = {https://mlanthology.org/tmlr/2025/schrodi2025tmlr-selective/}
}