Escaping Low-Rank Traps: Interpretable Visual Concept Learning via Implicit Vector Quantization
Abstract
Concept Bottleneck Models (CBMs) achieve interpretability by interposing a human-understandable concept layer between perception and label prediction. We first identify a \textit{many-to-many} mapping between visual features and concepts as a necessary condition for robust CBMs, a prerequisite that previous approaches have largely overlooked. Several recent methods attempt to establish this relationship, but we observe that they suffer from \textit{representational collapse}: visual patch features degenerate into a low-rank subspace during training, which severely degrades the quality of the learned concept activation vectors and hinders both model interpretability and downstream performance. To address these issues, we propose Implicit Vector Quantization (IVQ), a lightweight regularizer that maintains high-rank, diverse representations throughout training. Rather than imposing a hard bottleneck via direct quantization, IVQ learns a codebook prior that anchors semantic information in visual features, allowing it to act as a proxy objective. To further exploit these high-rank, concept-aware features, we propose Magnet Attention, which dynamically aggregates patch-level features into visual concept prototypes and explicitly models the many-to-many vision–concept correspondence. Extensive experiments show that our approach effectively prevents representational collapse and achieves state-of-the-art performance on diverse benchmarks. Our experiments further probe the low-rank phenomenon underlying representational collapse, finding that IVQ mitigates the information bottleneck and yields cross-modal representations with clearer, more interpretable consistency. Code is available at \url{https://github.com/Daryl-GSJ/IVQ-CBM}.
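The low-rank collapse described in the abstract can be made concrete with a simple diagnostic. The sketch below is not taken from the paper: it assumes patch features arrive as a `(num_patches, dim)` NumPy array and uses the entropy-based effective rank of the singular value spectrum as one possible measure. The function name `effective_rank`, the array shapes, and the synthetic data are all illustrative assumptions.

```python
import numpy as np


def effective_rank(features, eps=1e-12):
    """Entropy-based effective rank of a (num_patches, dim) feature matrix.

    Hypothetical diagnostic, not the paper's metric: a value close to
    min(num_patches, dim) means the spectrum is spread out, while a value
    close to 1 means the features lie almost on a single direction.
    """
    # Center the features so a shared offset does not count as a direction.
    centered = features - features.mean(axis=0, keepdims=True)
    # Singular values of the centered matrix, largest first.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    # Normalize the spectrum into a probability distribution.
    p = singular_values / (singular_values.sum() + eps)
    # Shannon entropy of the spectrum; exp(entropy) is the effective rank.
    entropy = -(p * np.log(p + eps)).sum()
    return float(np.exp(entropy))


# Synthetic stand-ins for patch features (assumed shapes: 196 patches, 64 dims).
rng = np.random.default_rng(0)
diverse = rng.standard_normal((196, 64))
# Features confined to a 2-dimensional subspace mimic a collapsed representation.
collapsed = rng.standard_normal((196, 2)) @ rng.standard_normal((2, 64))

print("diverse:", effective_rank(diverse))
print("collapsed:", effective_rank(collapsed))
```

Under these assumptions, the collapsed matrix scores an effective rank of at most about 2, while the unconstrained Gaussian features score far higher. Tracking such a quantity over training would be one way to observe the degeneration the abstract refers to.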
Cite
Text
Gao et al. "Escaping Low-Rank Traps: Interpretable Visual Concept Learning via Implicit Vector Quantization." International Conference on Learning Representations, 2026.
Markdown
[Gao et al. "Escaping Low-Rank Traps: Interpretable Visual Concept Learning via Implicit Vector Quantization." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/gao2026iclr-escaping/)
BibTeX
@inproceedings{gao2026iclr-escaping,
title = {{Escaping Low-Rank Traps: Interpretable Visual Concept Learning via Implicit Vector Quantization}},
author = {Gao, Shujian and Wang, Yuan and Ma, Chenglong and Gao, Xin and Yan, Jiangtao and Ning, Junzhi and Tang, Cheng and Ji, Changkai and Xu, Huihui and Li, Wei and Huang, Ziyan and Lin, Jiashi and Hu, Ming and Liu, Jiyao and Tang, Wenhao and Du, Ye and Li, Tianbin and Ye, Jin and He, Junjun},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/gao2026iclr-escaping/}
}