Domain-Generalizable Multiple-Domain Clustering

Abstract

This work generalizes the problem of unsupervised domain generalization to the case in which no labeled samples are available (completely unsupervised). We are given unlabeled samples from multiple source domains, and we aim to learn a shared predictor that assigns examples to semantically related clusters. Evaluation is done by predicting cluster assignments in previously unseen domains. Towards this goal, we propose a two-stage training framework: (1) self-supervised pre-training for extracting domain-invariant semantic features, and (2) multi-head cluster prediction with pseudo-labels that rely on both the feature space and the cluster head predictions, further leveraging a novel prediction-based label smoothing scheme. We demonstrate empirically that our model is more accurate than baselines that require fine-tuning with samples from the target domain or some level of supervision. Our code is available at \url{https://github.com/AmitRozner/domain-generalizable-multiple-domain-clustering}.
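The abstract mentions a prediction-based label smoothing scheme for the pseudo-labels in stage (2). The paper's exact formulation is not given here, but the general idea of confidence-dependent smoothing can be sketched as follows; the function name, the `eps_max` parameter, and the specific scaling rule are illustrative assumptions, not the authors' method:

```python
import numpy as np

def prediction_smoothed_labels(probs, num_classes, eps_max=0.2):
    """Hypothetical sketch: smooth one-hot pseudo-labels by an amount
    that shrinks with the cluster head's confidence, so confident
    predictions yield sharper targets."""
    confidence = probs.max(axis=1, keepdims=True)       # (N, 1) max class prob
    eps = eps_max * (1.0 - confidence)                  # per-sample smoothing
    hard = np.eye(num_classes)[probs.argmax(axis=1)]    # one-hot pseudo-labels
    # Standard label-smoothing mix, with a per-sample epsilon.
    return (1.0 - eps) * hard + eps / num_classes
```

Each output row remains a valid distribution, and the argmax of the pseudo-label is unchanged; only its sharpness varies with confidence.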

Cite

Text

Rozner et al. "Domain-Generalizable Multiple-Domain Clustering." Transactions on Machine Learning Research, 2024.

Markdown

[Rozner et al. "Domain-Generalizable Multiple-Domain Clustering." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/rozner2024tmlr-domaingeneralizable/)

BibTeX

@article{rozner2024tmlr-domaingeneralizable,
  title     = {{Domain-Generalizable Multiple-Domain Clustering}},
  author    = {Rozner, Amit and Battash, Barak and Wolf, Lior and Lindenbaum, Ofir},
  journal   = {Transactions on Machine Learning Research},
  year      = {2024},
  url       = {https://mlanthology.org/tmlr/2024/rozner2024tmlr-domaingeneralizable/}
}