Learning Separable Fine-Grained Representation via Dendrogram Construction from Coarse Labels for Fine-Grained Visual Recognition
Abstract
Learning fine-grained representations from coarse labels for fine-grained visual recognition (FGVR) is a challenging yet valuable task, as it alleviates the reliance on labor-intensive fine-grained annotations. Early approaches focused primarily on minimizing intra-fine-grained-class variation but overlooked inter-fine-grained-class separability, resulting in limited FGVR performance. Subsequent studies employed a top-down paradigm to enhance separability via deep clustering, yet these methods require predefining the number of fine-grained classes, which is often impractical to obtain. Here, we introduce a bottom-up learning paradigm that constructs a hierarchical dendrogram by iteratively merging similar instances/clusters, inferring higher-level semantics from lowest-level instances without predefining class numbers. Leveraging this, we propose BuCSFR, a novel method that integrates a Bottom-up Construction (BuC) module to build the dendrogram based on a minimal information loss criterion, and a Separable Fine-grained Representation (SFR) module that treats dendrogram nodes as pseudo-labels to ensure representation separability. The synergistic interaction between these modules enables iterative enhancement, grounded theoretically in the Expectation-Maximization (EM) framework. Extensive experiments on five benchmark datasets demonstrate the superiority of our approach, showcasing its effectiveness in learning separable representations for FGVR. The source code is available at: https://github.com/BeCarefulOfYournaoke/BuCSFR.
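The bottom-up paradigm described above can be illustrated with a minimal sketch of agglomerative merging. This is not the authors' BuC module: the abstract does not give the form of the minimal information loss criterion, so the sketch substitutes a Ward-style variance increase as an assumed stand-in. The names `centroid`, `merge_cost`, `build_dendrogram`, and `toy_features` are all hypothetical.

```python
from itertools import combinations


def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def merge_cost(a, b):
    """Ward-style increase in within-cluster variance if a and b are merged.

    Assumption: this stands in for the paper's 'minimal information loss'
    criterion, whose exact form is not stated in the abstract.
    """
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def build_dendrogram(features):
    """Repeatedly merge the cheapest pair until a single root remains.

    Returns a list of merges (left_id, right_id, new_id, cost). Ids below
    len(features) are leaf instances; higher ids are internal nodes.
    No number of classes is specified anywhere.
    """
    clusters = {i: [f] for i, f in enumerate(features)}
    next_id = len(features)
    merges = []
    while len(clusters) > 1:
        # Pick the pair of current clusters with the smallest merge cost.
        i, j = min(
            combinations(sorted(clusters), 2),
            key=lambda pair: merge_cost(clusters[pair[0]], clusters[pair[1]]),
        )
        cost = merge_cost(clusters[i], clusters[j])
        clusters[next_id] = clusters.pop(i) + clusters.pop(j)
        merges.append((i, j, next_id, cost))
        next_id += 1
    return merges


# Toy 2-D features: two well-separated pairs of nearby instances.
toy_features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.5, 5.0]]
merges = build_dendrogram(toy_features)
for left, right, new, cost in merges:
    print(f"merge {left} + {right} -> node {new} (cost {cost:.3f})")
```

In this toy run, each pair of nearby instances is merged before the two resulting clusters are joined at the root. Each internal node could then act as a pseudo-label at its own level of granularity, which is the role the abstract assigns to dendrogram nodes in the SFR module.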
Cite
Text
Shi et al. "Learning Separable Fine-Grained Representation via Dendrogram Construction from Coarse Labels for Fine-Grained Visual Recognition." International Conference on Computer Vision, 2025.
Markdown
[Shi et al. "Learning Separable Fine-Grained Representation via Dendrogram Construction from Coarse Labels for Fine-Grained Visual Recognition." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/shi2025iccv-learning/)
BibTeX
@inproceedings{shi2025iccv-learning,
title = {{Learning Separable Fine-Grained Representation via Dendrogram Construction from Coarse Labels for Fine-Grained Visual Recognition}},
author = {Shi, Guanghui and Liang, Xuefeng and Li, Wenjie and Lin, Xiaoyu},
booktitle = {International Conference on Computer Vision},
year = {2025},
pages = {870-879},
url = {https://mlanthology.org/iccv/2025/shi2025iccv-learning/}
}