There Was Never a Bottleneck in Concept Bottleneck Models
Abstract
Deep learning representations are often difficult to interpret, which can hinder their deployment in sensitive applications. Concept Bottleneck Models (CBMs) have emerged as a promising approach to mitigate this issue by learning representations that support target task performance while ensuring that each component predicts a concrete concept from a predefined set. In this work, we argue that CBMs do not impose a true bottleneck: the fact that a component can predict a concept does not guarantee that it encodes only information about that concept. This shortcoming raises concerns regarding interpretability and the validity of intervention procedures. To overcome this limitation, we propose Minimal Concept Bottleneck Models (MCBMs), which incorporate an Information Bottleneck (IB) objective to constrain each representation component to retain only the information relevant to its corresponding concept. This IB is implemented via a variational regularization term added to the training loss. As a result, MCBMs yield more interpretable representations, support principled concept-level interventions, and remain consistent with probability-theoretic foundations.
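The abstract states only that the IB is implemented as a variational regularization term added to the training loss; it does not give the form of that term. The following is a minimal sketch of one common way such a term can be written, not the authors' implementation. It assumes a Gaussian encoder per representation component, binary concepts, and a unit-variance Gaussian prior whose mean depends on the concept value. The names `gaussian_kl`, `mcbm_style_loss`, `prior_means`, and `beta` are hypothetical and chosen for illustration.

```python
import numpy as np


def gaussian_kl(mu_p, logvar_p, mu_q, logvar_q):
    """Elementwise KL( N(mu_p, exp(logvar_p)) || N(mu_q, exp(logvar_q)) )."""
    var_p = np.exp(logvar_p)
    var_q = np.exp(logvar_q)
    return 0.5 * (logvar_q - logvar_p + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)


def mcbm_style_loss(task_loss, concept_loss, mu_z, logvar_z, concepts,
                    prior_means, beta=0.1):
    """Add a per-component KL regularizer to the task and concept losses.

    Assumed shapes (illustrative, not taken from the paper):
      mu_z, logvar_z : (batch, n_concepts) encoder mean and log-variance
      concepts       : (batch, n_concepts) binary concept labels in {0, 1}
      prior_means    : (n_concepts, 2) prior mean for concept value 0 and 1
    """
    component = np.arange(concepts.shape[1])
    # Pick, for each sample and component, the prior mean matching its label.
    mu_q = prior_means[component, concepts.astype(int)]
    # Unit-variance prior (log-variance 0) is an assumption of this sketch.
    kl = gaussian_kl(mu_z, logvar_z, mu_q, np.zeros_like(mu_q))
    # Sum over components, average over the batch.
    reg = kl.sum(axis=1).mean()
    return task_loss + concept_loss + beta * reg, reg
```

In this sketch the regularizer is zero only when each component's encoder distribution matches the prior selected by its concept label, so any extra information about the input that a component carries is penalized. The weight `beta` trades that penalty off against task and concept accuracy.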
Cite
Text
Almudévar et al. "There Was Never a Bottleneck in Concept Bottleneck Models." International Conference on Learning Representations, 2026.
Markdown
[Almudévar et al. "There Was Never a Bottleneck in Concept Bottleneck Models." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/almudevar2026iclr-there/)
BibTeX
@inproceedings{almudevar2026iclr-there,
title = {{There Was Never a Bottleneck in Concept Bottleneck Models}},
author = {Almudévar, Antonio and Hernández-Lobato, José Miguel and Ortega, Alfonso},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/almudevar2026iclr-there/}
}