Hierarchical Divide-and-Conquer Grouping for Classification Adaptation of Pre-Trained Models

Abstract

Existing methods for adapting pre-trained vision-language models such as CLIP often rely on base-class samples during fine-tuning, introducing systematic biases that distort decision boundaries and degrade performance on novel classes. In this work, we propose a hierarchical divide-and-conquer framework that addresses classification bias at its root. Our method first segregates the label space into base and novel subspaces, ensuring domain separation. It then applies text-embedding clustering within each subspace to decompose ambiguous intra-domain classes into disentangled, fine-grained clusters. This two-stage grouping strategy not only alleviates class confusion but also enables domain-specific training in isolated subspaces, fostering specialized learning without overfitting to base categories. Experiments on three classification benchmarks show that our approach achieves state-of-the-art performance, surpassing the second-best method by 10% in average accuracy.
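The two-stage grouping described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `embed` function below is a random stand-in for CLIP text embeddings, plain k-means stands in for whatever clustering the authors use, and all function names are hypothetical. Only the overall shape — split the label space into base/novel subspaces, then cluster within each — follows the abstract.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means: returns a cluster index for each row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster empties.
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def hierarchical_grouping(class_names, base_set, embed, k=2):
    """Stage 1: split the label space into base / novel subspaces.
    Stage 2: cluster text embeddings within each subspace.
    Returns {(domain, cluster_id): [class names]}."""
    groups = {}
    for domain in ("base", "novel"):
        names = [c for c in class_names
                 if (c in base_set) == (domain == "base")]
        X = np.stack([embed(c) for c in names])
        labels = kmeans(X, min(k, len(names)))
        for name, lab in zip(names, labels):
            groups.setdefault((domain, int(lab)), []).append(name)
    return groups

def embed(name, dim=8):
    """Stand-in for a CLIP text encoder: a deterministic random vector."""
    rng = np.random.default_rng(abs(hash(name)) % 2**32)
    return rng.standard_normal(dim)
```

A downstream model could then be trained per group, so that base-class gradients never touch novel-class boundaries — which is the bias-isolation argument the abstract makes.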

Cite

Text

Lu et al. "Hierarchical Divide-and-Conquer Grouping for Classification Adaptation of Pre-Trained Models." International Conference on Computer Vision, 2025.

Markdown

[Lu et al. "Hierarchical Divide-and-Conquer Grouping for Classification Adaptation of Pre-Trained Models." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/lu2025iccv-hierarchical/)

BibTeX

@inproceedings{lu2025iccv-hierarchical,
  title     = {{Hierarchical Divide-and-Conquer Grouping for Classification Adaptation of Pre-Trained Models}},
  author    = {Lu, Ziqian and Yu, Yunlong and Tong, Qinyue and Liu, Jun},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {3575--3584},
  url       = {https://mlanthology.org/iccv/2025/lu2025iccv-hierarchical/}
}