Meta-Learning to Teach Semantic Prompts for Open Domain Generalization in Vision-Language Models

Abstract

Open Domain Generalization (ODG) addresses the challenges posed by both domain and category shifts between labeled source domains and unlabeled target domains. Current state-of-the-art methods are constrained by traditional CNN backbones, which limit generalization and increase error rates when detecting open-class samples in the target domain without any prior knowledge of them. In addition, recent CLIP-based prompt learning approaches fail to distinguish effectively between known and unknown classes, resulting in suboptimal performance. To address these challenges, we propose MetaPrompt, which combines the semantic strengths of the vision-language model CLIP with the "learning-to-learn" capabilities of meta-learning to achieve robust generalization under both domain and category shifts. Our framework introduces three key innovations. First, we cast ODG as a multi-class classification problem spanning both known and novel categories, and design prompts capable of detecting unknown-class samples across multiple domains. These prompts are trained with meta-learning and momentum updates, enabling smooth and accurate differentiation between known and unknown classes. Second, we introduce a domain-agnostic, semantic attention-based prompt alongside domain-focused prompts to improve robustness when classifying unknown classes across diverse domains. Finally, we incorporate an unsupervised contrastive loss during episodic meta-training, which reinforces the boundaries between known and unknown classes in the metric space and thereby strengthens the "unknown"-class awareness of the prompts. Extensive experiments on diverse datasets show that MetaPrompt excels in both closed-set and open-set DG scenarios and consistently outperforms existing solutions.
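To make the training recipe concrete, below is a minimal, illustrative sketch (in PyTorch-style Python) of the ingredients the abstract describes: learnable class prompts with an extra "unknown" slot, a momentum-updated copy of the prompts, and an unsupervised contrastive term that separates known from pseudo-unknown embeddings. All names, shapes, and hyperparameters here are assumptions made for illustration, not the authors' implementation.

# Illustrative sketch only; module names and hyperparameters are assumptions,
# not the authors' code.
import torch
import torch.nn.functional as F

class PromptLearner(torch.nn.Module):
    """Learnable context vectors per class, including an extra 'unknown' slot,
    intended to be passed through CLIP's frozen text encoder."""
    def __init__(self, n_classes=7, n_ctx=4, ctx_dim=512):
        super().__init__()
        self.ctx = torch.nn.Parameter(0.02 * torch.randn(n_classes, n_ctx, ctx_dim))

    def forward(self):
        return self.ctx

@torch.no_grad()
def momentum_update(student: PromptLearner, teacher: PromptLearner, m: float = 0.999):
    """EMA ('momentum') update of a teacher copy of the prompts, used to
    smooth the episodic meta-training."""
    for p_s, p_t in zip(student.parameters(), teacher.parameters()):
        p_t.mul_(m).add_(p_s, alpha=1.0 - m)

def contrastive_separation_loss(feats, is_known, temperature=0.1):
    """Unsupervised contrastive term: embeddings sharing the same known/unknown
    status attract each other, pushing known and unknown clusters apart."""
    feats = F.normalize(feats, dim=-1)
    sim = feats @ feats.t() / temperature
    self_mask = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    sim = sim.masked_fill(self_mask, -1e9)          # exclude self-similarity
    pos = (is_known[:, None] == is_known[None, :]).float()
    pos = pos.masked_fill(self_mask, 0.0)           # self is not a positive
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    return -(pos * log_prob).sum(1).div(pos.sum(1).clamp(min=1.0)).mean()

In the full method, these pieces would presumably sit inside an episodic loop: sample pseudo-open episodes from the source domains, optimize the student prompts with the classification and contrastive objectives, and refresh the teacher prompts with momentum_update after each episode.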

Cite

Text

Bose et al. "Meta-Learning to Teach Semantic Prompts for Open Domain Generalization in Vision-Language Models." Transactions on Machine Learning Research, 2025.

Markdown

[Bose et al. "Meta-Learning to Teach Semantic Prompts for Open Domain Generalization in Vision-Language Models." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/bose2025tmlr-metalearning/)

BibTeX

@article{bose2025tmlr-metalearning,
  title     = {{Meta-Learning to Teach Semantic Prompts for Open Domain Generalization in Vision-Language Models}},
  author    = {Bose, Shirsha and Singha, Mainak and Jha, Ankit and Mukhopadhyay, Souradeep and Banerjee, Biplab},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/bose2025tmlr-metalearning/}
}